I have a project which will basically be an auction platform.
At any given time there could be 1000-10,000 active auctions, but there will only ever be around 10 bidders on each auction.
Auctions would typically last around 10min up to 2 hours.
Typically, all auctions and bidding in the system would happen within a 4 hour window; outside of that window all auctions are closed.
What is the best way to structure this type of application?
My initial thinking is to have Auctions and Bids stored in the database.
For every Auction there would also be a GenServer with an internal tick (Process.send_after) which checks every second whether the auction should have ended yet.
Once it has ended, the GenServer could close the auction and stop accepting bids.
If the GenServer crashes, it could rebuild its state from the database.
The internal tick would also reduce load, since it checks whether the auction has ended without querying the DB.
From there I could also accept bids into the GenServer and only persist that data once the auction has ended, further reducing load on the DB if I wanted.
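A minimal sketch of that per-auction GenServer idea, with all module and field names made up for illustration (the real schema, supervision, and persistence calls are assumptions, not a definitive design):

```elixir
defmodule AuctionServer do
  use GenServer

  # Hypothetical state: auction id, end time, and the bids collected so far.
  defstruct [:auction_id, :ends_at, bids: [], status: :open]

  def start_link(auction_id, ends_at) do
    GenServer.start_link(__MODULE__, {auction_id, ends_at})
  end

  @impl true
  def init({auction_id, ends_at}) do
    schedule_tick()
    {:ok, %__MODULE__{auction_id: auction_id, ends_at: ends_at}}
  end

  # The one-second internal tick described above: close the auction
  # once the end time has passed, otherwise schedule the next tick.
  @impl true
  def handle_info(:tick, state) do
    if state.status == :open and
         DateTime.compare(DateTime.utc_now(), state.ends_at) == :gt do
      # Persisting the accumulated bids to the DB would happen here.
      {:noreply, %{state | status: :closed}}
    else
      schedule_tick()
      {:noreply, state}
    end
  end

  # Bids are accepted into the GenServer while the auction is open.
  @impl true
  def handle_call({:bid, bid}, _from, %{status: :open} = state) do
    {:reply, :ok, %{state | bids: [bid | state.bids]}}
  end

  def handle_call({:bid, _bid}, _from, state) do
    {:reply, {:error, :closed}, state}
  end

  defp schedule_tick, do: Process.send_after(self(), :tick, 1_000)
end
```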
Are there any drawbacks or risks with this scenario?
Without a GenServer, I guess I could have a cron job that runs every second(ish) and updates every auction's status.
Would this be simpler? Does it scale better or worse?
Is there any easier way, or better way to do this given the scale I am looking at?
Lastly, if this were to scale to 50,000 auctions and hundreds of bidders, does this change things at all?
A couple of thoughts (I’m not an auction expert):
- I would think bidders expect high precision on the start and end times, so a `Process.send_after` tick is likely not the right mechanism to determine whether an auction is open or not
- Preserving the order of bids is presumably very important in an auction
- Latency of bid acceptance into the bidding queue is important - i.e. predictable and consistent queuing and ordering under load (for which, of course, the BEAM is well suited)
Intuitively I think of this as:
- GenServer per auction to provide the ordering
- Phoenix PubSub or similar to broadcast state of the auction to all bidders with low latency and soft real-time characteristics
- Each bid has a timestamp applied on entry to the system, and when pulling a bid off the queue (in the `handle_call`) use that timestamp to decide whether the bid falls within the valid period of the auction. This also gives you a better audit record of bids and the times they entered the system
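That timestamp check inside the `handle_call` could be factored into a small pure function. A sketch, where the field names (`entered_at`, `opens_at`, `closes_at`) are assumptions for illustration:

```elixir
defmodule BidValidation do
  # Decide, from the timestamp stamped on entry to the system,
  # whether a bid falls within the auction's valid period.
  def validate(%{entered_at: entered_at}, %{opens_at: opens_at, closes_at: closes_at}) do
    cond do
      DateTime.compare(entered_at, opens_at) == :lt -> {:error, :not_yet_open}
      DateTime.compare(entered_at, closes_at) == :gt -> {:error, :auction_closed}
      true -> :ok
    end
  end
end
```

Because the decision depends only on the stamped entry time, a bid that arrived before the close but is processed a moment after is still accepted, which is the audit-friendly behaviour described above.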
I see complexity in the area of:
1. A valid bid is entered and it's the new high bid
2. You want to store all the bids for audit purposes, and this takes time
3. You want to broadcast the high bid to the other bidders in real time, as fast as possible
4. (2) and (3) are somewhat in conflict: you need to keep consistent state reliably and also broadcast the highest bid as fast as possible.
My understanding is that in auctions the action can get fast and furious in the closing minutes/seconds, hence a good solution to (4) might be important. I'm not the person to help you with that.
I got pretty far building an auction app for my fantasy football league (so, you know, very critical that we achieve high precision!). It was a learning exercise, so I built it with the idea that it could handle multiple auctions simultaneously.
I did not bother to implement timestamps, but otherwise it looked a lot like what is described here, and that idea makes sense. The only thing I would add is that I used ETS for persisting the data beyond the GenServers. Obviously ETS can die if you lose the machine or the BEAM otherwise crashes, but I can imagine it being your first stop given the speed with which I understand one can read/write to ETS. You could also still have a separate database that you update asynchronously (only reading from it if the entire BEAM crashes, or to dump to a data warehouse for querying) as a second backup.
Point being, you will still have the consistency problem, but the durability tradeoff is just that, I think. I assumed (but did not test) that going back and forth to ETS resulted in much reduced latency with only a minimal loss of durability relative to using a separate database. Of course, if you get to a scale that requires you to distribute your auction processes across multiple nodes, ETS would be off the table.
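A sketch of that ETS-first approach, where the table name, key scheme, and bid shape are all assumptions, and the async DB backup is only indicated by a comment:

```elixir
defmodule BidStore do
  # One public named ETS table holding bids keyed by {auction_id, seq};
  # :ordered_set keeps bids in insertion order within each auction.
  def init do
    :ets.new(:bids, [:ordered_set, :public, :named_table])
  end

  def record_bid(auction_id, seq, bid) do
    :ets.insert(:bids, {{auction_id, seq}, bid})
    # An async DB write (e.g. via a Task or a buffering process)
    # could be kicked off here as the durable second backup.
    :ok
  end

  # All bids for one auction, in sequence order.
  def bids_for(auction_id) do
    :ets.match_object(:bids, {{auction_id, :_}, :_})
  end
end
```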
Is creating up to 10,000 auctions over 2 hours really that much, though?
If I was just using Ecto to manage the auctions and bids, would it fail?
You’re in the last 2 seconds of your auction. There are 100 bidders online. Some might even be doing algorithmic bidding. The most important thing is that each bidder has the current bid price in as near to immediate time as possible. The second thing is that you can accept as many bids as quickly as possible with similar latency and response so all bidders are treated the same. Third thing is to keep auction integrity - everyone sees the same information at the same time, each bid is treated equally, bids are accepted to the last microsecond, the highest bidder wins etc etc.
Synchronising persistent state at the same time as delivering on those core auction requirements is challenging.
I think that's what @srowley and I are both saying (and apologies if I misinterpreted your reply).
I would not expect 10_000 auctions in itself to be a special consideration.
I’ve worked on auction websites for the past 10 years or so. (Not with Elixir for the core logic.)
A few random things that could be worth thinking about:
You want to be very sure to avoid race conditions, especially towards the end of the auction. If two people both place a bid, and then the closes-at time has passed, you don’t want to tell both they won. Consider employing stuff like bid queues and DB transactions.
Consider how you want to handle unsold, relisted auctions if that ever happens. Should they just be two independent Auctions in DB that are mostly identical? Or should they share some stuff?
If you have a bid ladder (must increase bids by at least $10) and max bids (like eBay, where I can say I want to bid up to $1000 and the site will bid for me up to that amount), consider whether max bids need to adhere to the bid ladder steps or not. Say the current bidding is at $10 and I place a max bid of $31, if the site allows it. Now the current bidding will be at $20. If the next person tries to bid $30, what should happen?
The site could accept their bid, then bid on my behalf, so their bid was $30 and I lead the bidding at $31. This is what we do – it makes for a more dynamic bidding process and lets power users strategise, which is fun. But it also causes frequent support issues of the sort “why could someone win over me with $1 if the bid ladder says I need to raise by $10”.
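That max-bid resolution policy can be sketched as a pure function. Everything here (the fixed $10 increment, the `{leader, leader_max, displayed}` state shape) is an assumption for illustration, and ties and ladder edge cases are deliberately omitted:

```elixir
defmodule ProxyBidding do
  @increment 10

  # Resolve a new bid against the current leader's hidden max bid:
  # the new bid must beat the displayed price by the ladder increment,
  # then the site auto-bids on the leader's behalf up to their max.
  def resolve({leader, leader_max, displayed}, {bidder, amount}) do
    cond do
      amount < displayed + @increment ->
        {:rejected, {leader, leader_max, displayed}}

      amount > leader_max ->
        # New bidder beats the leader's max and takes the lead.
        {:accepted, {bidder, amount, min(amount, leader_max + @increment)}}

      true ->
        # Site bids for the leader: they keep the lead, possibly by less
        # than a full ladder step (the support-ticket case above).
        {:accepted, {leader, leader_max, min(leader_max, amount + @increment)}}
    end
  end
end
```

With the example above (bidding displayed at $20, my hidden max $31), a $30 bid leaves me leading at $31 - the $1 gap that generates the support tickets.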
At which point do you think it would be best to store data to the database (or ETS)?
Should a bid be accepted by the GenServer and then written asynchronously to the DB right away?
Or wait until the end of the Auction and store all data before terminating the genserver?
Persisting the data on the fly sounds safer, since if the GenServer crashes I could restart it more accurately from the DB. Otherwise all data is lost unless I catch the crashing GenServer, and I am not sure how error-prone that is. Which way would be more accurate?
But then, am I putting undue load on the DB by storing every bid right away if it's not necessary?