@Neoon said:
It seems that a recent masscan is mandatory; I was still using a two-month-old one.
The build just finished 3 hours earlier and with a 7% higher hit rate, so 80% without mtr.
Lesson learned: masscan will be updated at least once per week, gg.
As soon as I get the mtr integration working, it should easily get a 90%+ hit rate.
I thought port scanning was frowned upon by most data-centers/hosts?
I never said I was port scanning.
90% hit rate as in for IPs, or for returning the correct geo-loc/country?
Hit rate means you get a result.
Accuracy depends on the amount of locations.
@tuc said:
Any database guru please help with an indexing and sql statement for best performance on searching the dataset given an IP address.
Thanks in advance
Probably by storing it in binary: the .mmdb uses a binary search tree, does lookups in a fraction of a second, and is just 15MB in size compared to the csv.
@Ganonk said:
Hello Sir, can you explain to me how to use that mmdb?
I have never used mmdb before. Thanks.
You provide an IP address, it gives you the approximate location.
Python example is on github.
@tuc said:
I just made a tiny shell script to add two columns: ipfrom and ipto to the datasets made available by @Neoon. Not sure if my style of making data ready before importing into databases will help
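#!/bin/bash
# Appends ipfrom and ipto (integer range bounds) to each CSV row read from $1.
if [ -z "$1" ]; then
  echo "usage: $0 <csv-file>" >&2
  exit 1
fi
while IFS=',' read -r ipr cont cnt lat lng rad; do
  if [ -z "$ipr" ]; then break; fi
  IFS='/' read -r IP CIDR <<<"$ipr"
  IFS='.' read -r a b c d <<<"$IP"
  # Full 32-bit address; bash arithmetic is 64-bit, so no sign workaround is needed.
  ip=$(( (a << 24) | (b << 16) | (c << 8) | d ))
  # Network mask clamped to 32 bits, so prefixes shorter than /8 are handled too.
  MASK=$(( (0xFFFFFFFF << (32 - CIDR)) & 0xFFFFFFFF ))
  ipfrom=$(( ip & MASK ))
  ipto=$(( ipfrom | (~MASK & 0xFFFFFFFF) ))
  echo "$ipr,$ipfrom,$ipto,$cont,$cnt,$lat,$lng,$rad"
done <"$1"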
I never tried something like this before. Here is a transcript of the steps I tried. Maybe the transcript will be helpful to anybody who can show me my mistakes or who wants a beginner style recipe.
Step 1: Make a yammdb neighborhood
notoles@fsn:~$ mkdir neoon
notoles@fsn:~$ cd neoon/
notoles@fsn:~/neoon$
Step 2: Grab the data
notoles@fsn:~/neoon$ time wget https://yammdb.serv.app/geo.mmdb
--2023-06-04 05:17:49-- https://yammdb.serv.app/geo.mmdb
Resolving yammdb.serv.app (yammdb.serv.app)... 45.130.21.225
Connecting to yammdb.serv.app (yammdb.serv.app)|45.130.21.225|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 16172538 (15M) [application/octet-stream]
Saving to: ‘geo.mmdb’
geo.mmdb 100%[==============================>] 15.42M 71.8MB/s in 0.2s
2023-06-04 05:17:49 (71.8 MB/s) - ‘geo.mmdb’ saved [16172538/16172538]
real 0m0.382s
user 0m0.013s
sys 0m0.017s
notoles@fsn:~/neoon$ ls
geo.mmdb
notoles@fsn:~/neoon$ ls -al
total 15804
drwxr-xr-x 2 notoles notoles 4096 Jun 4 05:17 .
drwx--x--x 9 notoles notoles 4096 Jun 4 05:17 ..
-rw-r--r-- 1 notoles notoles 16172538 Jun 2 17:03 geo.mmdb
notoles@fsn:~/neoon$
Step 3: Grab the program with which to read the data.
notoles@fsn:~/neoon$ ed read.py
read.py: No such file or directory
a
import geoip2.database
reader = geoip2.database.Reader("geo.mmdb")
response = reader.city("1.1.1.1")
print("Continent",response.continent.code)
print("Country",response.country.iso_code)
print("Latitude",response.location.latitude,"Longitude",response.location.longitude)
print(f"Latency {response.location.accuracy_radius}ms")
.
w
329
q
notoles@fsn:~/neoon$
notoles@fsn:~/neoon$ sudo apt-get install python3-geoip2
[sudo] password for notoles:
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
python3-maxminddb
Suggested packages:
python-maxmindb-doc
The following NEW packages will be installed:
python3-geoip2 python3-maxminddb
0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.
Need to get 50.4 kB of archives.
After this operation, 203 kB of additional disk space will be used.
Do you want to continue? [Y/n]
Get:1 http://deb.debian.org/debian unstable/main amd64 python3-maxminddb amd64 2.2.0-1+b1 [27.6 kB]
Get:2 http://deb.debian.org/debian unstable/main amd64 python3-geoip2 all 2.9.0+dfsg1-5 [22.8 kB]
Fetched 50.4 kB in 0s (1,435 kB/s)
Selecting previously unselected package python3-maxminddb.
(Reading database ... 56808 files and directories currently installed.)
Preparing to unpack .../python3-maxminddb_2.2.0-1+b1_amd64.deb ...
Unpacking python3-maxminddb (2.2.0-1+b1) ...
Selecting previously unselected package python3-geoip2.
Preparing to unpack .../python3-geoip2_2.9.0+dfsg1-5_all.deb ...
Unpacking python3-geoip2 (2.9.0+dfsg1-5) ...
Setting up python3-maxminddb (2.2.0-1+b1) ...
Setting up python3-geoip2 (2.9.0+dfsg1-5) ...
notoles@fsn:~/neoon$ ls
geo.mmdb read.py
notoles@fsn:~/neoon$
Step 5: Get an IP address to check
notoles@fsn:~/neoon$ w
20:22:21 up 3 days, 21:00, 1 user, load average: 0.00, 0.00, 0.00
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
notoles pts/0 187.189.238.1 20:17 0.00s 0.03s ? w
notoles@fsn:~/neoon$
Step 6: Read the data for the IP address
notoles@fsn:~/neoon$ time python3 ./read.py 187.189.238.1
Continent EU
Country CZ
Latitude 50.08 Longitude 14.41
Latency 0.0ms
real 0m0.026s
user 0m0.022s
sys 0m0.004s
notoles@fsn:~/neoon$
Step 7: Ponder
The IP address that I am using seems to be shown as located in CZ. However, the address might be better placed here, where I am using it while visiting Sonora, Mexico. Probably I made some mistakes! Is everything reported as EU/CZ because I need to add additional, basic steps? Can someone please help me understand why my geolocation result seems possibly incorrect?
Because you didn't change the IP in the script.
You probably want to replace the line 1.1.1.1 with response = reader.city(sys.argv[1])
which takes your argument.
Need to import sys though.
notoles@fsn:~/neoon$ time python3 ./read.py 187.189.238.1
Traceback (most recent call last):
File "/home/notoles/neoon/./read.py", line 4, in <module>
response = reader.city(sys.argv[1])
^^^
NameError: name 'sys' is not defined
real 0m0.026s
user 0m0.019s
sys 0m0.007s
notoles@fsn:~/neoon$
notoles@fsn:~/neoon$ ed read.py
331
2
i
import sys
.
w
342
q
notoles@fsn:~/neoon$ time python3 ./read.py 187.189.238.1
Continent NA
Country US
Latitude 32.77 Longitude -96.8
Latency 43.0ms
real 0m0.024s
user 0m0.024s
sys 0m0.000s
notoles@fsn:~/neoon$
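Putting both fixes from the transcript together, read.py could end up looking roughly like the sketch below. It assumes the geoip2 package and the geo.mmdb file from the steps above. The usage message, the optional second argument for the database path, and the helper names parse_args, lookup and describe are additions for illustration and are not part of the example on github.

```python
import sys


def parse_args(argv):
    """Return (ip, db_path); db_path falls back to geo.mmdb in the current directory."""
    ip = argv[1]
    db_path = argv[2] if len(argv) > 2 else "geo.mmdb"
    return ip, db_path


def lookup(ip, db_path="geo.mmdb"):
    """Open the database, look up one address, and close the reader again."""
    # Third-party package (python3-geoip2); imported here so the rest of the
    # file can be loaded even where it is not installed.
    import geoip2.database

    reader = geoip2.database.Reader(db_path)
    try:
        return reader.city(ip)
    finally:
        reader.close()


def describe(response):
    """Format the same four fields that the original read.py prints."""
    loc = response.location
    return [
        f"Continent {response.continent.code}",
        f"Country {response.country.iso_code}",
        f"Latitude {loc.latitude} Longitude {loc.longitude}",
        # This database stores the measured latency in accuracy_radius.
        f"Latency {loc.accuracy_radius}ms",
    ]


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print(f"usage: {sys.argv[0]} IP [DB_PATH]")
    else:
        ip, db_path = parse_args(sys.argv)
        print("\n".join(describe(lookup(ip, db_path))))
```

Run it the same way as in the transcript: python3 ./read.py 187.189.238.1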
Comments
You should read up what masscan can do.
Yes, as I said a few times.
Free NAT KVM | Free NAT LXC | Bobr
ITS WEDNESDAY MY DUDES
Sorry, I mean step-by-step instructions for running/reading a .mmdb file, like apt install blabla, and then the next step,
on Linux, Android, or Windows.
Convert the IP/subnet (CIDR) to an IP range containing the starting and ending IP.
Convert the starting and ending IP to a "long" number.
For PHP, you can use:
Save it in the DB like:
index, startingIPLong, endingIPLong, Location
When you want to check an IP, convert it to a "long" number, and find the row
WHERE startingIPLong>= searchIP AND endingIPLong<= searchIP
I think you'll be able to figure it out from there?
For PHP to convert the IP to long, use function: https://www.php.net/manual/en/function.ip2long.php
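For anyone not using PHP, the same two conversions can be sketched in Python with the standard ipaddress module. The function names cidr_to_range and ip_to_long are made up for this example, and 192.0.2.0/24 is just a documentation prefix, not a row from the dataset.

```python
import ipaddress


def cidr_to_range(cidr):
    """Return (first, last) address of an IPv4 network as integers."""
    # strict=False accepts input such as 192.0.2.7/24 where host bits are set.
    net = ipaddress.ip_network(cidr, strict=False)
    return int(net.network_address), int(net.broadcast_address)


def ip_to_long(ip):
    """Return the integer ("long") form of a dotted-quad IPv4 address."""
    return int(ipaddress.IPv4Address(ip))


if __name__ == "__main__":
    start, end = cidr_to_range("192.0.2.0/24")
    print(start, end)
    # An address belongs to the range when start <= address <= end.
    print(start <= ip_to_long("192.0.2.77") <= end)
```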
Websites have ads, I have ad-blocker.
Thanks @somik for the function. That really helps.
Btw, I did it the other way around. Guess I should not be browsing LES first thing after waking up:
WHERE startingIPLong <= searchIP AND endingIPLong >= searchIP
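On the indexing question, here is a minimal sketch using Python's built-in sqlite3 module. It assumes the ranges do not overlap, borrows the column names ipfrom and ipto from the shell script in this thread, and uses invented rows with placeholder country codes (XA, XB). Instead of testing both bounds across the whole table, it asks the index for the last range starting at or before the address and then checks the upper bound, which keeps the query to a single index seek.

```python
import ipaddress
import sqlite3


def build_db(rows):
    """Create an in-memory table of (ipfrom, ipto, country) rows."""
    con = sqlite3.connect(":memory:")
    # INTEGER PRIMARY KEY makes ipfrom the indexed row key in SQLite.
    con.execute(
        "CREATE TABLE geo (ipfrom INTEGER PRIMARY KEY, ipto INTEGER NOT NULL, country TEXT)"
    )
    con.executemany("INSERT INTO geo VALUES (?, ?, ?)", rows)
    return con


def lookup(con, ip):
    """Return the country for ip, or None when no range contains it."""
    n = int(ipaddress.IPv4Address(ip))
    row = con.execute(
        "SELECT ipto, country FROM geo WHERE ipfrom <= ? ORDER BY ipfrom DESC LIMIT 1",
        (n,),
    ).fetchone()
    # The candidate row starts at or before n; it matches only if it also ends at or after n.
    if row is not None and row[0] >= n:
        return row[1]
    return None


def cidr_row(cidr, country):
    """Helper for the sample data: turn a CIDR into an (ipfrom, ipto, country) row."""
    net = ipaddress.ip_network(cidr)
    return int(net.network_address), int(net.broadcast_address), country


if __name__ == "__main__":
    con = build_db([cidr_row("192.0.2.0/24", "XA"), cidr_row("198.51.100.0/24", "XB")])
    for ip in ("192.0.2.77", "198.51.100.5", "192.0.3.1"):
        print(ip, lookup(con, ip))
```

Other databases can do the same with an ordinary index on the starting column; the two-sided WHERE clause above also works, but may scan many more rows.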
Websites have ads, I have ad-blocker.
Just a small typo mistake
I just made a tiny shell script to add two columns: ipfrom and ipto to the datasets made available by @Neoon. Not sure if my style of making data ready before importing into databases will help
To use the script:
output sample
That's what I did as well, just with PHP. Process the data before importing it into DB to make subsequent queries faster.
Websites have ads, I have ad-blocker.
Hi @Ganonk! Hi @Neoon!
I never tried something like this before. Here is a transcript of the steps I tried. Maybe the transcript will be helpful to anybody who can show me my mistakes or who wants a beginner style recipe.
I cut and pasted the program from https://github.com/Ne00n/yammdb
The IP address that I am using seems to be shown as located in CZ. However, the address might be better placed here, where I am using it while visiting Sonora, Mexico. Probably I made some mistakes! Is everything reported as EU/CZ because I need to add additional, basic steps? Can someone please help me understand why my geolocation result seems possibly incorrect?
I hope everyone gets the servers they want!
Because you didn't change the IP in the script.
You probably want to replace the line 1.1.1.1 with
response = reader.city(sys.argv[1])
which takes your argument.
Need to import sys though.
Free NAT KVM | Free NAT LXC | Bobr
ITS WEDNESDAY MY DUDES
Thanks so much! I will make that change and post the result.
If you wish, here is another question:
What mistake am I making here? Have I wrongly identified the file whose checksum should match? Thanks!
I hope everyone gets the servers they want!
The checksum wasn't updated since the last build, probably an issue in my build script.
Free NAT KVM | Free NAT LXC | Bobr
ITS WEDNESDAY MY DUDES
Thanks!
I hope everyone gets the servers they want!
Making the change that @Neoon suggested:
Looks like we need to import the sys module.
According to findlatitudeandlongitude.com the reported latitude and longitude are in Dallas, TX USA.
The ISP here, as of the last time I checked a while ago, seemed to use multiple hops with private addresses between my location and 187.189.238.1. Based on ping times and other factors, I previously imagined that 187.189.238.1 might be located in Missouri. Maybe it was in Dallas, or it is in Dallas now.
Vendors seem universally to identify 187.189.238.1 as a Mexican IP. I can pretty much count on being offered default pages in Spanish and default prices in pesos.
Thanks @Neoon!
I hope everyone gets the servers they want!
Yes, seems like the DB I am using also identifies your IP as Mexico: https://ip2c.ziox.us/?ip=187.189.238.1
The source code is located here: https://github.com/somik123/IP-to-Country
I am currently using the "lite" database from ip2location.com
Websites have ads, I have ad-blocker.
I don't have a probe in Mexico, the closest is Texas, so it's as close as it can get.
Free NAT KVM | Free NAT LXC | Bobr
ITS WEDNESDAY MY DUDES
how do you calculate the positions? i assumed you are triangulating different "nearby" probes? eg in this case the results of Texas and Sao Paulo.
If you mean by nearby, all, yes.
If 20 of them say Texas is the lowest, it's the lowest.
I am too lazy, and I don't wanna risk inaccuracy because I fucked up the mapping or there was a routing issue.
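The "lowest latency wins" idea can be sketched in a few lines of Python. This is only an illustration of the approach as described in this post, not the actual build code; the probe names, the latency numbers, and the choice of the best (minimum) round trip per probe are all assumptions.

```python
def closest_probe(samples):
    """Pick the probe with the lowest round-trip time.

    samples maps a probe name to a list of RTT measurements in ms.
    Returns (probe_name, best_rtt_ms), or None if nothing answered.
    """
    # Use each probe's best measurement; probes with no replies are skipped.
    best = {name: min(rtts) for name, rtts in samples.items() if rtts}
    if not best:
        return None
    name = min(best, key=best.get)
    return name, best[name]


if __name__ == "__main__":
    # Invented measurements towards one target address.
    samples = {
        "Dallas": [43.0, 44.1],
        "Sao Paulo": [131.2],
        "Prague": [140.5, 139.9],
        "Tokyo": [],
    }
    print(closest_probe(samples))
```

The target then simply inherits the location of the winning probe, which is why an address in Mexico shows up in Texas when there is no probe in Mexico.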
Free NAT KVM | Free NAT LXC | Bobr
ITS WEDNESDAY MY DUDES
ah, i see. so no triangulation but rather finding the closest datacenter.
i imagine using the latency of the ~three closest probes could be used to put the ip somewhere between them. but the calculation of course could be messed up by "non-straightforward" routing.
Well, I find this to be the most reliable.
I can define an area, in-between, these 3 probes, that's no issue.
Let's say I traceroute them from the closest 3.
It still doesn't tell me where the location is.
It's off limits, since I don't have a probe there.
Yes I could ask a 3rd party, to tell me where it is and cross verify if the data given by them, is in this area.
However, I don't want to use a 3rd party.
I can try a build with that for sure, a company already reached out to me and told me I could use their data but I refused.
Will see.
Free NAT KVM | Free NAT LXC | Bobr
ITS WEDNESDAY MY DUDES
if you know the locations of the closest 3 and take the latencies as an estimate for the distances you can calculate an educated guess...
https://github.com/TBMSP/Trilateration
Latency is unreliable as fuck, you can't tell, when it's 10ms, where it is.
Bad Routing? Still might be around the corner. Could also be not.
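A rough worked example of why a single latency figure says so little: light in fiber covers roughly 200 km per millisecond, so a round-trip time only gives an upper bound on distance. The sketch below assumes that figure and ignores routing detours and queuing, which only make the real distance shorter than the bound; the function name is made up for this example.

```python
# Approximate signal speed in optical fiber, about two thirds of c.
FIBER_KM_PER_MS = 200.0


def max_distance_km(rtt_ms):
    """Upper bound on the one-way distance implied by a round-trip time."""
    one_way_ms = rtt_ms / 2.0
    return one_way_ms * FIBER_KM_PER_MS


if __name__ == "__main__":
    for rtt in (10.0, 43.0):
        print(f"{rtt} ms RTT: target is at most {max_distance_km(rtt):.0f} km away")
```

So 10ms means "somewhere within about 1000 km", which could be around the corner behind a bad route or several cities away.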
Free NAT KVM | Free NAT LXC | Bobr
ITS WEDNESDAY MY DUDES
I get latency that's fluctuating all the time. So best to avoid latency.
Websites have ads, I have ad-blocker.
Usually it's stable, but routes can change or links can become congested.
Hence I do the measurements from all locations in a very small time frame, usually seconds apart.
Otherwise results may differ on each build due to these fluctuations, or the measurements even drift apart, sometimes by hours, which makes it even worse.
Free NAT KVM | Free NAT LXC | Bobr
ITS WEDNESDAY MY DUDES
@Not_Oles many thanks for your awesome tutorial 🤝🏽 but it is very difficult for me to try
You are welcome!
How can we make it easy for you to try?
I hope everyone gets the servers they want!
make a ready to deploy docker container with required configurations and web ui?
Websites have ads, I have ad-blocker.