LINQ performance and pitfalls
LINQ is beautiful and you may be tempted to write some cool queries, but beware: they might perform poorly. Learn from these examples.
Published 9 March 2021
Example 1 – Use ToLookup instead of a Where clause
Original code before optimization
//this code is mapping data to a view model...
foreach (var store in stockinfo.stores)
{
    store.stockOnlineStatus = ...
    store.stockOnlineText = ...
    //set inventory status on the stores that have inventory
    storeinventory.Where(w => w.store?.warehouseId == store.warehouseId)
        .Select(w => MapStockStatus(store, w))
        .ToList();
}
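How the LINQ query works under the hood: Where first filters the inventory rows for the given warehouseId, scanning the whole collection and potentially matching many rows; Select then runs the Map function on each match; ToList is only there to actually execute the query. In short, it is a LINQ query that populates/maps stock data onto the store object. Because it sits inside the foreach, the whole inventory is scanned once per store.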
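The deferred execution described above can be seen in a minimal sketch. The sample data and the mapCalls counter are hypothetical and only stand in for the inventory rows and the Map function.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data: (warehouseId, quantity) pairs.
var inventory = new List<(int WarehouseId, int Quantity)>
{
    (1, 5), (2, 0), (1, 7),
};

var mapCalls = 0;

// Building the query runs nothing yet: Where and Select are deferred.
var query = inventory
    .Where(i => i.WarehouseId == 1)
    .Select(i => { mapCalls++; return i.Quantity; });

var callsBeforeToList = mapCalls; // still 0

// ToList enumerates the query, so the Select lambda runs once per match.
var quantities = query.ToList();
var callsAfterToList = mapCalls; // 2, one per matching row

Console.WriteLine($"before ToList: {callsBeforeToList}, after ToList: {callsAfterToList}");
```

Without the ToList call, the mapping in the original code would never run, which is why it is there even though the resulting list is thrown away.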
Instead we use ToLookup and an if statement
ToLookup builds an ILookup, a more flexible relative of Dictionary; read more further down.
//we want the lookup key to be warehouseId; store may be null, as in the original query
var lookupInv = storeinventory.ToLookup(i => i.store?.warehouseId);
foreach (var store in stockinfo.stores)
{
    store.stockOnlineStatus = ...
    store.stockOnlineText = ...
    //set inventory status on the stores that have inventory
    if (lookupInv.Contains(store.warehouseId))
    {
        //only the first matching inventory row is mapped
        MapStockStatus(store, lookupInv[store.warehouseId].FirstOrDefault());
    }
}
Performance metrics
before: 7 sec
after: 80 ms at best, 700 ms at most in all cases
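Where the difference comes from can be sketched by counting lambda calls. The data below is made up (1,000 stores with one inventory row each) and the counters are only there for illustration; real timings depend on the data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data: 1,000 stores and one inventory row per store.
var storeIds = Enumerable.Range(1, 1000).ToList();
var inventory = storeIds.Select(id => (WarehouseId: id, Quantity: id % 5)).ToList();

// Variant 1: Where inside the loop scans the whole inventory for every store.
long predicateCalls = 0;
var matchesWithWhere = 0;
foreach (var storeId in storeIds)
{
    matchesWithWhere += inventory
        .Where(i => { predicateCalls++; return i.WarehouseId == storeId; })
        .Count();
}

// Variant 2: build the lookup once, then each store is a single keyed lookup.
long keySelectorCalls = 0;
var lookup = inventory.ToLookup(i => { keySelectorCalls++; return i.WarehouseId; });
var matchesWithLookup = 0;
foreach (var storeId in storeIds)
{
    if (lookup.Contains(storeId))
    {
        matchesWithLookup += lookup[storeId].Count();
    }
}

Console.WriteLine($"Where in loop: {predicateCalls} predicate calls");
Console.WriteLine($"ToLookup:      {keySelectorCalls} key selector calls");
```

Both variants find the same matches, but the first one evaluates its predicate stores × inventory rows times, while the second one touches each inventory row once when the lookup is built.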
Example 2 – Replace Where clause with ToLookup
Code before optimization
public StorePage GetStorePageByWarehouseId(int? warehouseId)
{
    return this.ListAllStores()
        .Where(s => s.WarehouseId == warehouseId)
        .FirstOrDefault();
}
This method was called for every variant and, within each variant, for every stock warehouse.
Instead use cache and ToLookup
First I tried ListAllStores().FirstOrDefault(x => x.WarehouseId == warehouseId), which was faster but still not optimal. I finally used ToLookup like this:
public StorePage GetStorePageByWarehouseId(int? warehouseId)
{
    var lookupList = GetLookupAllStoresByWarehouseId(); // this saves up to 400ms on some products index when many variants
    if (lookupList.Contains(warehouseId))
    {
        return lookupList[warehouseId].FirstOrDefault();
    }
    return null;
}

public ILookup<int?, StorePage> GetLookupAllStoresByWarehouseId()
{
    var stores = _cache.Get("LookupListAllStores") as ILookup<int?, StorePage>;
    if (stores == null)
    {
        stores = ListAllStores().ToLookup(l => l.WarehouseId);
        _cache.Insert("LookupListAllStores", stores, new CacheEvictionPolicy(TimeSpan.FromSeconds(15), CacheTimeoutType.Absolute));
    }
    return stores;
}
Because the function was called inside loops, potentially for 100 warehouses per variant, I wanted to build the lookup at most once per request, since ToLookup has an overhead too. Cache to the rescue, even if only for 15 seconds.
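The effect of a short-lived cache can be sketched without any CMS dependency. Everything below is a hypothetical stand-in: ListAllStores returns made-up data, buildCount only counts rebuilds, and the cached/expiresAt pair imitates an absolute 15-second expiration rather than using the real cache API from the article.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for ListAllStores(); buildCount tracks how often it is read.
var buildCount = 0;
IEnumerable<(int? WarehouseId, string Name)> ListAllStores()
{
    buildCount++;
    return new (int? WarehouseId, string Name)[]
    {
        (1, "North"), (2, "South"), (null, "Unassigned"),
    };
}

// Minimal stand-in for a cache entry with absolute expiration.
// It starts out expired, so the first call always builds the lookup.
ILookup<int?, string> cached = Enumerable.Empty<string>().ToLookup(_ => (int?)null);
var expiresAt = DateTime.MinValue;

ILookup<int?, string> GetLookup(DateTime now)
{
    if (now >= expiresAt)
    {
        cached = ListAllStores().ToLookup(s => s.WarehouseId, s => s.Name);
        expiresAt = now.AddSeconds(15);
    }
    return cached;
}

// 100 calls inside the 15-second window build the lookup only once.
var start = new DateTime(2021, 3, 9, 12, 0, 0);
for (var i = 0; i < 100; i++)
{
    GetLookup(start.AddSeconds(i % 10));
}
var buildsWithinWindow = buildCount;

// A call after the window has passed rebuilds it.
GetLookup(start.AddSeconds(20));
var buildsAfterExpiry = buildCount;

Console.WriteLine($"builds within window: {buildsWithinWindow}, after expiry: {buildsAfterExpiry}");
```

The time is passed in as a parameter only to keep the sketch deterministic; a real cache would read the clock itself.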
Performance metrics
before: 700ms to 1s
after: 480ms
ToLookup vs ToDictionary
A lookup is more forgiving than a dictionary. ToLookup allows null as a key, and querying a lookup for a key that doesn’t exist simply returns an empty sequence. Do the same with a dictionary’s indexer and you’ll get a KeyNotFoundException.
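A minimal sketch of these differences, using made-up store data (the tuple shape and names are assumptions for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stores; one of them has no warehouse id.
var stores = new (int? WarehouseId, string Name)[]
{
    (1, "North"), (null, "Unassigned"),
};

// A lookup accepts the null key and returns an empty sequence for unknown keys.
var lookup = stores.ToLookup(s => s.WarehouseId, s => s.Name);
var nullKeyed = lookup[null].ToList();
var missing = lookup[42].ToList();

// ToDictionary rejects the null key.
var threwOnNullKey = false;
try { stores.ToDictionary(s => s.WarehouseId, s => s.Name); }
catch (ArgumentNullException) { threwOnNullKey = true; }

// A dictionary indexer throws for a key that is not present.
var dictionary = stores
    .Where(s => s.WarehouseId.HasValue)
    .ToDictionary(s => s.WarehouseId.Value, s => s.Name);
var threwOnMissingKey = false;
try { _ = dictionary[42]; }
catch (KeyNotFoundException) { threwOnMissingKey = true; }

Console.WriteLine($"lookup[null]: {nullKeyed.Count} item(s), lookup[42]: {missing.Count} item(s)");
Console.WriteLine($"dictionary threw on null key: {threwOnNullKey}, on missing key: {threwOnMissingKey}");
```

With a dictionary, TryGetValue or ContainsKey avoids the exception for missing keys, but the null key remains off limits.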
A dictionary is a 1:1 map (each key is mapped to a single value), and a dictionary is mutable (editable) after the fact.
A lookup is a 1:many map (multi-map; each key is mapped to an IEnumerable<> of the values with that key), and there is no mutate on the ILookup<,> interface.
An overly simplified way of looking at it is that a Lookup<TKey,TElement> is roughly comparable to a Dictionary<TKey,IEnumerable<TElement>>.
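The 1:1 versus 1:many difference can be sketched as follows. The inventory rows are hypothetical, and the GroupBy/ToDictionary combination is just one way to build the roughly comparable dictionary by hand.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical inventory rows; warehouse 1 appears twice.
var inventory = new (int WarehouseId, string Sku)[]
{
    (1, "A"), (1, "B"), (2, "C"),
};

// 1:many: the lookup groups both rows for warehouse 1 under one key.
var lookup = inventory.ToLookup(i => i.WarehouseId, i => i.Sku);
var skusInWarehouse1 = string.Join(",", lookup[1]);

// 1:1: ToDictionary refuses the duplicate key.
var threwOnDuplicateKey = false;
try { inventory.ToDictionary(i => i.WarehouseId, i => i.Sku); }
catch (ArgumentException) { threwOnDuplicateKey = true; }

// Roughly the same shape as the lookup, built by hand.
Dictionary<int, IEnumerable<string>> grouped = inventory
    .GroupBy(i => i.WarehouseId, i => i.Sku)
    .ToDictionary(g => g.Key, g => g.AsEnumerable());

// The dictionary can be changed afterwards; ILookup offers no such member.
grouped[3] = new[] { "D" };

Console.WriteLine($"lookup[1]: {skusInWarehouse1}");
Console.WriteLine($"dictionary keys after edit: {string.Join(",", grouped.Keys.OrderBy(k => k))}");
```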
Conclusions
- Don’t chain a Where clause with FirstOrDefault; put the criteria in FirstOrDefault instead
- If you use FirstOrDefault with criteria inside a loop, put the data in a ToLookup lookup before the loop and query it with Contains
- Use a cache, even if only for a few seconds
About the author
Luc Gosso
– Independent Senior Web Developer
working with Azure and Episerver
Twitter: @LucGosso
LinkedIn: linkedin.com/in/luc-gosso/
Github: github.com/lucgosso