yammdb - just another .mmdb
Hey,
Today I want to present yammdb to you.
It is yet another geo database, this one based on real measurements.
You can find it here: https://github.com/Ne00n/yammdb
The database is built weekly, on Fridays.
As long as the build server doesn't blow up.
The primary use case for me is comparing existing geo data; maybe some of you will find it useful too.
If you have any ideas or feedback, please lemme know.
Enjoy.
Comments
Latency is now included from the closest location.
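A minimal sketch of what "latency from the closest location" could mean, assuming each subnet has one round-trip time per probe and the lowest one wins. The probe names, the values and the `closest_probe` helper are made up for illustration and are not taken from the yammdb code.

```python
def closest_probe(latencies):
    """Return (location, latency_ms) for the lowest latency, skipping timeouts (None)."""
    answered = {loc: ms for loc, ms in latencies.items() if ms is not None}
    if not answered:
        return None  # no probe got an answer from this subnet
    location = min(answered, key=answered.get)
    return location, answered[location]

# Hypothetical measurements in milliseconds; None marks a probe that got no reply.
sample = {"Frankfurt": 12.4, "Madrid": 31.0, "Singapore": None}
print(closest_probe(sample))
```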
Free NAT KVM | Free NAT LXC | Bobr
ITS WEDNESDAY MY DUDES
Umm I'm stupid and new to all this. What is this???
Put in an IP, and it'll tell you the physical location of the server/host behind that IP.
Websites have ads, I have ad-blocker.
Oh cool!
I'm trying this out with my gDNSd servers. The db seems much smaller than the ip-to-city-lite db I have been using, 16MB compared to 99MB. If this works as well it is going to save me a bunch of RAM.
Thank you for this !!
For staff assistance or support issues please use the helpdesk ticket system at https://support.lowendspirit.com/index.php?a=add
It has way less "useless" data in it, hence it's so smol.
It should work with auto_dc maps using geo coords, but I have no idea how accurate it is.
Thank you! I hope there is an acl version for bind9
I wrote a smol, 30-line benchmark script, using a dump of the global routing table.
Hit rate: 65%, 638k of the 975k entries in the routing table, which is good, I expected less.
The GitHub page said 600k.
Meanwhile, the other .mmdb's have a 99% hit rate.
TL;DR: Yes, use it, but not as your primary database. Build your own, which was my primary purpose.
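A benchmark like this can be sketched in a few lines. This is an illustration under assumptions, not the actual 30-line script: it tests only the first address of each prefix, and `toy_lookup` is a stand-in for a real database lookup. With a real .mmdb, the `get` method of a reader opened via the `maxminddb` package could be passed as `lookup` instead.

```python
import ipaddress

def hit_rate(prefixes, lookup):
    """Look up the first address of every prefix and count how many return a result."""
    stats = {"fail": 0, "success": 0}
    for prefix in prefixes:
        network = ipaddress.ip_network(prefix, strict=False)
        address = str(network.network_address)
        if lookup(address) is not None:
            stats["success"] += 1
        else:
            stats["fail"] += 1
    total = stats["fail"] + stats["success"]
    stats["percentage"] = 100 * stats["success"] / total if total else 0.0
    return stats

# Stand-in database: two documentation prefixes (hypothetical content).
KNOWN = [ipaddress.ip_network("192.0.2.0/24"), ipaddress.ip_network("198.51.100.0/24")]

def toy_lookup(address):
    """Return a record if the address falls into a known network, else None."""
    ip = ipaddress.ip_address(address)
    for net in KNOWN:
        if ip in net:
            return {"network": str(net)}
    return None

# Hypothetical "routing table dump".
routes = ["192.0.2.0/25", "198.51.100.128/25", "203.0.113.0/24", "192.0.2.128/25"]
result = hit_rate(routes, toy_lookup)
print(result)
```

Note that a "hit" here only means the lookup returned something; it says nothing about whether the location is correct.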
Fair enough.
If you build your own, you can probably drop the memory usage even further.
Throw everything out and turn it into a flying gas can.
Speaking of hit rate, the benchmark I ran yesterday gave me a 65% hit rate against a routing table dump.
That is more than I expected, roughly 648k out of 975k. However, the second benchmark I ran hit only 35% with 8.5 million IP addresses.
This was because bigger subnets are split into smaller ones for more accurate data, but the ones which didn't respond were not filled in. This has been fixed.
After fixing these bugs, the hit rate is at 74%.
The next step is to see where I can further improve the data I use for all of this, so I end up with higher hit rates.
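The splitting and gap filling described above could look roughly like this. It is a sketch under assumptions: how yammdb actually fills the non-responding subnets is not stated in the thread, so the `fallback` value (for example a result measured for the parent subnet) and the `split_and_fill` helper are hypothetical.

```python
import ipaddress

def split_and_fill(prefix, new_prefix, measured, fallback):
    """Split prefix into /new_prefix subnets; use measured data where present, else fallback."""
    parent = ipaddress.ip_network(prefix)
    filled = {}
    for subnet in parent.subnets(new_prefix=new_prefix):
        # Subnets that did not respond would otherwise be missing (a lookup miss).
        filled[str(subnet)] = measured.get(str(subnet), fallback)
    return filled

# Hypothetical results: only two of the four /24s answered.
measured = {"198.51.100.0/24": "Madrid", "198.51.102.0/24": "Frankfurt"}
table = split_and_fill("198.51.100.0/22", 24, measured, fallback="Frankfurt")
for subnet, location in table.items():
    print(subnet, location)
```

Without the fallback, any address in 198.51.101.0/24 or 198.51.103.0/24 would count as a miss, which is the kind of gap that drags a hit rate down.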
Thanks to https://virtury.com/ we got a new Probe in Pakistan!
It took a bit longer than expected; however, the software is now mostly optimized for more probes.
Expect more probes in the coming weeks.
Daily test builds, not guaranteed, will be available at https://yammdb.serv.app/test.mmdb
The weekly build will happen as usual.
Also, thanks to https://ginernet.com for a new probe in Madrid.
It was supposed to be done already; however, I fucked up.
One function had the build hanging for hours on end.
This is fixed, thanks to GPT4 once again.
It should finish in a bit. Once this is done, I will make a second test build, including a bunch of new locations.
I noticed it's now smaller than the previous version I tested. Did you reduce the coverage or change the format?
Either way, it's a cool and useful idea for cross-checking other geolocators.
No, but I noticed it too.
The only way I can explain it is how the writer builds the database.
Basically, I tried to aggregate the prefixes to make the database even smaller; however, it seems like the writer already does this.
So my aggregation did not change the size after all. A while ago, the database had a lot of gaps, because of the way it pings bigger subnets.
These gaps have been closed, hence I assume that the writer can now optimize / aggregate the database even further, hence it's smaller. The code definitely does not remove data and has not removed any.
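To illustrate why closed gaps can make the file smaller: adjacent prefixes that carry identical data can be merged into one larger prefix. The sketch below uses `ipaddress.collapse_addresses` from the Python standard library and assumes non-overlapping IPv4 prefixes with hashable values. It is not the .mmdb writer's actual algorithm, and the `aggregate` helper and sample data are hypothetical.

```python
import ipaddress
from collections import defaultdict

def aggregate(entries):
    """Merge prefixes that carry identical data into the fewest covering networks."""
    by_value = defaultdict(list)
    for prefix, value in entries.items():
        by_value[value].append(ipaddress.ip_network(prefix))
    merged = {}
    for value, networks in by_value.items():
        # collapse_addresses joins neighbouring networks, e.g. two /24s into one /23.
        for network in ipaddress.collapse_addresses(networks):
            merged[str(network)] = value
    return merged

# Hypothetical data: once the gap at .101 is filled, .100 and .101 can merge.
entries = {
    "198.51.100.0/24": "Madrid",
    "198.51.101.0/24": "Madrid",
    "198.51.102.0/24": "Frankfurt",
    "198.51.103.0/24": "Madrid",
}
merged = aggregate(entries)
print(merged)
```

Four entries become three here. With a gap at 198.51.101.0/24, nothing could have been merged.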
I added a few more locations for this test run.
Gonna be the biggest Friday run yet.
Thanks to some people who followed my GitHub repo: they actually gave me the idea to make an mtr-only geo database.
I coded it in less than 24 hours; however, the hit rates were too low and my brain had not yet managed to figure out where the fuck-up was.
Today I found the mapping error.
db/mtr.mmdb {'fail': 126502, 'success': 849072, 'percentage': 87.03306976200678}
From 64% to 74%, and now an 87% hit rate, not bad.
I put the .mmdb as usual on https://yammdb.serv.app/mtr.mmdb
This database is only 4.2MB in size and only contains geo coordinates right now.
I will add the usual info, such as country, continent etc., in a later build.
Plus, I will add a combined build later, from geo.mmdb and mtr.mmdb, which first uses latency, then mtr for better accuracy.
Any plans to release a CSV version of the mmdb?
I updated the mtr.mmdb; it now includes continent, country and latency, same as the geo.mmdb.
@somik Sure, I added the csv file: https://yammdb.serv.app/mtr.csv
Currently they are smaller than the geo.mmdb due to fewer measurements per subnet. This will change once I run them again.
I also added geo.mmdb as csv: https://yammdb.serv.app/geo.csv
There is no compression or anything, hence the file is so big.
Usually the .mmdb writer does the compression.
Best to have it without compression for maximum compatibility. I'm visiting our neighbouring country for some good foods now, so I'll test it out once I go back to Singapore.
On that note, seems like a lot of shops closed down over the last pandemic... Sad days.
Well, I guess an mtr-only database with more tests per subnet won't be happening.
It takes too long, roughly 1-2 days to finish a build with roughly 8+ million targets.
Even with 20 probes running at the same time.
Instead, I am going to run another test build next week, which runs mtr on subnets that don't ping and combines them with the latency results, as mentioned before.
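The combination described here, latency results first and mtr only for subnets that did not answer pings, can be sketched as a simple fallback merge. The record layout, the `source` field and the `combine` helper are assumptions for illustration, not the actual yammdb build code.

```python
def combine(latency_results, mtr_results):
    """Prefer latency data; fall back to mtr data for subnets that did not answer pings."""
    combined = {}
    for subnet in sorted(set(latency_results) | set(mtr_results)):
        if latency_results.get(subnet) is not None:
            combined[subnet] = {**latency_results[subnet], "source": "latency"}
        elif mtr_results.get(subnet) is not None:
            combined[subnet] = {**mtr_results[subnet], "source": "mtr"}
        # Subnets without any result stay out of the database (a lookup miss).
    return combined

# Hypothetical results; None marks a subnet that gave no usable answer.
latency_results = {"192.0.2.0/24": {"country": "ES"}, "198.51.100.0/24": None}
mtr_results = {"198.51.100.0/24": {"country": "PK"}, "203.0.113.0/24": None}
combined = combine(latency_results, mtr_results)
print(combined)
```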
@Neoon I think your CSV headers (table titles) are missing for both geo.csv and mtr.csv
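Until a header row is added, a headerless CSV can still be read by supplying column names by hand. The sketch below uses Python's `csv.DictReader`. The column names and the sample row are hypothetical; the real layout of geo.csv and mtr.csv may differ.

```python
import csv
import io

# Hypothetical column names; the real geo.csv / mtr.csv layout may differ.
FIELDNAMES = ["subnet", "continent", "country", "latitude", "longitude", "latency"]

def read_headerless(handle, fieldnames=FIELDNAMES):
    """Read a CSV that has no header row, naming the columns by hand."""
    return list(csv.DictReader(handle, fieldnames=fieldnames))

# Made-up sample row using a documentation prefix, not real yammdb data.
sample = io.StringIO("192.0.2.0/24,EU,ES,40.4,-3.7,12.5\n")
rows = read_headerless(sample)
print(rows[0]["subnet"], rows[0]["country"])
```

For a downloaded file, `open("geo.csv", newline="")` would take the place of the `io.StringIO` sample. All values come back as strings, so coordinates and latency still need a `float()` conversion.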
I will change that before the next build tomorrow.
It seems that a recent masscan is mandatory; I was still using a 2-month-old one.
The build just finished 3 hours earlier and with a 7% higher hit rate, so 80% without mtr.
Lesson learned: masscan will be updated at least once per week, gg.
As soon as I get the mtr integration working, it should easily get a 90%+ hit rate.
I thought port scanning was frowned upon by most data-centers/hosts?
Btw, what's MTR?
Does a 90% hit rate mean a result for 90% of IPs, or that 90% return the correct geo-loc/country?
I never said I was port scanning.
Hit rate means you get a result.
Accuracy depends on the number of locations.
Isn't masscan used for port scanning? Are you using it to scan for something else?