So in #42, one issue that comes up multiple times is cache misses.
Basically, RegistryManager::get_package is racy:
1. look up the package in the cache; if it's there, return it
2. if not, fetch the package
3. insert the package into the cache
This works, but it has a race condition. (Not a data race, though!) If we spin up two threads, and both are looking for the same package, thread 1 does step 1, doesn't find it, and goes on to step 2 to fetch. Then thread 2 starts and does the same thing, and we've fetched the package twice. This manifests as a cache miss.
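To make the shape of the problem concrete, here's a minimal sketch of that check-then-fetch pattern. The RwLock, the Package type, and the fetch_package helper are stand-ins I'm making up for illustration, not the actual code. Note that the map itself is only ever touched under a lock, which is why this is a race condition but not a data race: the data structure stays consistent, we just do the fetch twice.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

#[derive(Clone)]
struct Package; // stand-in for the real package type

struct RegistryManager {
    cache: RwLock<HashMap<String, Package>>,
}

impl RegistryManager {
    fn get_package(&self, name: &str) -> Package {
        // step 1: look up the package in the cache
        if let Some(pkg) = self.cache.read().unwrap().get(name) {
            return pkg.clone();
        }
        // step 2: not cached, so fetch it. Two threads can both get here
        // for the same name, because the read lock was released above.
        let pkg = fetch_package(name);
        // step 3: insert it into the cache (a second thread's insert
        // just overwrites the first one's)
        self.cache.write().unwrap().insert(name.to_string(), pkg.clone());
        pkg
    }
}

fn fetch_package(name: &str) -> Package {
    // hypothetical stand-in for the real network fetch
    let _ = name;
    Package
}
```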
The alternative is to make the second thread block in some fashion. We want the first one to hold some sort of lock, so that if someone else wants something we're already working on, we ask them to wait until it's ready. But there's not just one way to do this!
The first way, and the most obvious one, is to put a mutex around the whole thing. That way, if someone is downloading a new package, you wait until they're done. The problem is that you'd end up blocking everyone whenever any package is being downloaded, which kind of defeats the purpose of downloading them in parallel, haha.
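As a rough sketch (again with made-up names), that coarse-grained version looks something like this. The lock is held across the network fetch, so every caller, even one asking for a package that's already cached, serializes behind it:

```rust
use std::collections::HashMap;
use std::sync::Mutex;

#[derive(Clone)]
struct Package;

struct RegistryManager {
    cache: Mutex<HashMap<String, Package>>,
}

impl RegistryManager {
    fn get_package(&self, name: &str) -> Package {
        // one lock held for the entire check-fetch-insert sequence
        let mut cache = self.cache.lock().unwrap();
        if let Some(pkg) = cache.get(name) {
            return pkg.clone();
        }
        // the lock is still held here, so every other thread waits
        // behind this (possibly slow) download
        let pkg = fetch_package(name);
        cache.insert(name.to_string(), pkg.clone());
        pkg
    }
}

fn fetch_package(name: &str) -> Package {
    // hypothetical stand-in for the real network fetch
    let _ = name;
    Package
}
```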
So my next thought is that we need to change the data modelling from "package exists or not" to "package doesn't exist, package is being downloaded, package is complete in the cache." That more properly represents what's going on in the real world. But it's not actually the best way forward. Why? Well, what happens when we have a cache hit? We want to simply return the package. And if we know that some other task is already downloading the package, what we actually want is to also just return. We have no more work to do! Sure, it's not ready yet, but the other task will make sure it is.
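Here's one way that idea could look, as a sketch under a couple of assumptions: the names (ensure_package, in_flight, fetch_package) are mine, and I'm assuming the caller only needs the download to be kicked off rather than the package handed back right away. The trick is to track which names are currently being downloaded, and if somebody else already owns the fetch, just return:

```rust
use std::collections::{HashMap, HashSet};
use std::sync::Mutex;

#[derive(Clone)]
struct Package;

struct RegistryManager {
    cache: Mutex<HashMap<String, Package>>,
    // names that some task is currently downloading
    in_flight: Mutex<HashSet<String>>,
}

impl RegistryManager {
    fn ensure_package(&self, name: &str) {
        // cache hit: nothing to do
        if self.cache.lock().unwrap().contains_key(name) {
            return;
        }
        // someone else is already downloading it: also nothing to do,
        // they'll insert it into the cache when they're done
        if !self.in_flight.lock().unwrap().insert(name.to_string()) {
            return;
        }
        // we own the download: fetch without holding any lock
        let pkg = fetch_package(name);
        self.cache.lock().unwrap().insert(name.to_string(), pkg);
        self.in_flight.lock().unwrap().remove(name);
    }
}

fn fetch_package(name: &str) -> Package {
    // hypothetical stand-in for the real network fetch
    let _ = name;
    Package
}
```

Both the cache hit and the "someone else is on it" case turn into an early return; only the task that wins the in_flight insert pays for the download, and no lock is held across the network.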
So, I think that's the task to fix this issue. I don't think it should be a super complex diff, but I'm not going to try and write a patch right this second.