codeRaider
Building a Massive Data Structure Correctly
TL;DR
Is there a better, more efficient way to build a massive data structure (like a snapshot), or is this a job for another language?
In depth
In a Redis server, we have objects segmented by parameters. The base object is the institute, which has a base JSON object with parameters like address, phone number, code, id, etc. Each institute also has nested keys holding data (JSON objects), like institute:<code>:teachers, institute:<code>:executive and others.
A builder creates this massive structure for many institutes from a list of institute codes. Going one by one, it generates a base structure for each institute, then merges in the nested objects.
A little example:
%{
  "institute-a" => %{
    address: "Street A 453",
    id: 12,
    status: "active",
    online: true,
    location: %{
      country: "United States",
      state: "CA",
      city: "Los Angeles"
    },
    teachers: %{
      "john-doe" => %{
        age: 35,
        asignatures: ["Art", "Music"],
        id: 999
      },
      ...
    },
    courses: %{
      "A1" => %{
        students: %{
          "carol-doe" => %{
            age: 15,
            grades: %{
              "music" => "A",
              "arts" => "B+"
            },
            ....
          }
        }
      }
    }
  },
  ....
}
Building this structure used to take a reasonable time (about 5 seconds), but more institutes have been added to the Redis server, and we now have 500 institutes.
I have tried many approaches, but when the data is this massive, it takes about 15~20 seconds to show the data or to send it as a response in a channel event.
I know Elixir is not well suited to this kind of processing, and this is temporary, but I need ideas to make it more efficient while respecting the Elixir way of coding.
Most Liked Responses
chulkilee
Did you find out which part actually takes most of the time?
If building the data takes 5 seconds but the whole request takes 15-20, then something else is taking 10-15 seconds.
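A minimal sketch of how the stages could be timed separately, using only :timer.tc/1 from the standard library. The module name and the fetch, build and encode functions are hypothetical stand-ins for the Redis reads, the merge step and the JSON or channel encoding; they are not taken from the original code.

```elixir
defmodule SnapshotTiming do
  # Runs `fun` and returns {elapsed_ms, result}.
  # :timer.tc/1 reports microseconds, so convert to milliseconds.
  def measure(fun) when is_function(fun, 0) do
    {micros, result} = :timer.tc(fun)
    {div(micros, 1000), result}
  end

  # Times each stage on its own. `fetch`, `build` and `encode` are
  # hypothetical one-argument functions supplied by the caller.
  def profile(codes, fetch, build, encode) do
    {fetch_ms, raw} = measure(fn -> fetch.(codes) end)
    {build_ms, snapshot} = measure(fn -> build.(raw) end)
    {encode_ms, payload} = measure(fn -> encode.(snapshot) end)

    {%{fetch_ms: fetch_ms, build_ms: build_ms, encode_ms: encode_ms}, payload}
  end
end
```

If fetch_ms dominates, the time is going to Redis round trips rather than to building the map in Elixir.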
Things to check before going further (e.g. rustler):
- Use an iolist, not a string, for the Redis command (if the library supports it)
- Check for Redis library or server settings that slow things down, especially the Redis response time (not your app's response time)
- Do not rebuild the data again
- Avoid copying large data across Erlang nodes; it may be better to refetch from the Phoenix channel process
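For the "do not rebuild" point, one possible approach is a per-institute cache in ETS. This is only a sketch: the module name, table name and build_fun are hypothetical, and invalidation is left to whatever code writes to Redis.

```elixir
defmodule SnapshotCache do
  # Hypothetical table name for this sketch.
  @table :institute_snapshot_cache

  # Creates the named public table. The table is owned by the calling
  # process, so call this from a long-lived process (e.g. at start-up).
  def init do
    :ets.new(@table, [:named_table, :set, :public, read_concurrency: true])
    :ok
  end

  # Returns the cached structure for `code`, building it only on a miss.
  # `build_fun` is a hypothetical function that builds one institute.
  def fetch(code, build_fun) when is_function(build_fun, 1) do
    case :ets.lookup(@table, code) do
      [{^code, cached}] ->
        cached

      [] ->
        built = build_fun.(code)
        :ets.insert(@table, {code, built})
        built
    end
  end

  # Drops one entry, e.g. after that institute's keys change in Redis.
  def invalidate(code) do
    :ets.delete(@table, code)
    :ok
  end
end
```

ETS reads still copy the term into the calling process, so caching one entry per institute keeps each copy small compared with caching the whole snapshot as a single term.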
chulkilee
Okay, then you have an N+1 query problem. This is a data-fetching problem, not necessarily an Elixir one.
You may make it faster by performing the N+1 queries in parallel (using Task, for instance), but in that case you lose the benefit of Redis pipelining.
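A rough sketch of the parallel variant with Task.async_stream/3. The module name and fetch_fun are hypothetical; fetch_fun stands for whatever loads and merges a single institute from Redis.

```elixir
defmodule ParallelBuilder do
  # Builds %{code => institute_map} by running `fetch_fun` for several
  # institute codes concurrently. `fetch_fun` is a hypothetical function
  # that loads and merges the data of one institute.
  def build(codes, fetch_fun, opts \\ []) do
    max = Keyword.get(opts, :max_concurrency, System.schedulers_online() * 2)
    timeout = Keyword.get(opts, :timeout, 5_000)

    codes
    |> Task.async_stream(fn code -> {code, fetch_fun.(code)} end,
      max_concurrency: max,
      timeout: timeout
    )
    # With the default on_timeout: :exit, a slow task exits the caller,
    # so only {:ok, _} results reach this reduce.
    |> Enum.reduce(%{}, fn {:ok, {code, data}}, acc -> Map.put(acc, code, data) end)
  end
end
```

Capping max_concurrency matters here: 500 simultaneous requests over a single Redis connection or a small pool mostly end up queueing, so the useful value depends on how the client is set up.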
Anyway, you're not accessing the data the "Redis way". You may prefetch most values if you know the key pattern in advance, e.g. using SCAN or its family of commands.
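When the key pattern is known, the whole command list can be prepared up front and sent as one pipelined batch instead of one round trip per key. The sketch below only builds the commands and regroups the replies. It assumes the base object lives under institute:<code> and that every value is a plain string read with GET; neither is confirmed by the question. The module name and the "base" field name are hypothetical.

```elixir
defmodule PrefetchPlan do
  # Nested key suffixes mentioned in the question; the others would be added here.
  @suffixes ["teachers", "executive"]

  # One GET for the (assumed) base key plus one per nested key, for every code.
  def commands(codes, suffixes \\ @suffixes) do
    Enum.flat_map(codes, fn code ->
      base = "institute:" <> code
      [["GET", base] | Enum.map(suffixes, &["GET", base <> ":" <> &1])]
    end)
  end

  # Regroups the flat reply list (same order as the commands) into
  # %{code => %{"base" => reply, suffix => reply, ...}}.
  def group(codes, replies, suffixes \\ @suffixes) do
    fields = ["base" | suffixes]

    codes
    |> Enum.zip(Enum.chunk_every(replies, length(fields)))
    |> Map.new(fn {code, chunk} -> {code, Map.new(Enum.zip(fields, chunk))} end)
  end
end
```

Assuming the Redix client, the replies would come from something like Redix.pipeline(conn, PrefetchPlan.commands(codes)), which returns {:ok, replies} in command order; the JSON decoding of each reply would then happen after PrefetchPlan.group/3.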
Or you can use a Lua script to preprocess a little on the Redis side.
Since you have Redis, you may use it for caching as well, which helps if the transformation is expensive compared to the network bandwidth between the Erlang nodes and the Redis instance.
darnahsan
You can have a look at this; it might be of some help, or at least give you an idea of how to tackle your problem.








