1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
{
  "Jena": "673710",
  "Jena-Zentrum": "1006125655",
  "Plan B Boulderhalle Jena": "613227777",
  "Kassablanca Jena": "5752477",
  "Kulturarena Jena": "574057843",
  "University of Jena": "251848169",
  "Paradiespark Para - Jena Paradies": "228419929",
  "Ernst-Abbe-Sportfeld": "326665551",
  ...
}

How to quickly mine location IDs from Instagram’s location explorer.

Instagram's awful geotag architecture

Since Facebook’s closing of the Graph API it has become incredibly hard to mine Instagram location IDs. Until now, the most convenient method out there is either to simply use Instagrams native search (i.e. type the name of a city such as “aurajoki-turku”) and just copy the Instagram location ID of the URL, such as “246052326” from https://www.instagram.com/explore/locations/246052326/aurajoki-turku/. The same functionality is implemented by instagram-scraper. Install the Python package and type

1
instagram-scraper kölner dom --search-location

and you’ll get

1
2
3
location-id: 122730131604847, title: Kölner Dom, subtitle: Mannheim, city: Mannheim, lat: 49.48864, lng: 8.47244
location-id: 259861859, title: Kölner Dom - die offizielle Seite, subtitle: Domkloster 4, Köln, Germany, city: Köln, Germany, lat: 50.941314646747, lng: 6.958135692401
location-id: 421375188017874, title: Köln Dom Katedral, subtitle: , city: , lat: 50.941302950953, lng: 6.9581383466721

which gives an idea about the incredibly messy architecture behind Instagram locations. As every user can create these tags regardless of any existing tags, the precision or quality one can get an idea about the chaos. Technically, you can never be sure to get good results or even pick the Instagram location ID with the most posts. Also, the coordinates sometimes are completely wrong so be careful here. Anyway instagram-scraper did a good job by providing this CLI-search.

Instagram location explorer

Instagram has a quite hidden page with some of its locations sorted by countries and then some of the most common places in a country such as cities or very unique landmarks: https://www.instagram.com/explore/locations. Of course - how could it be any different - you cannot find all Instagram location IDs there but just approximately 1000. If you’d like to mine posts of different locations for the German city Jena, first click on Germany, then Jena. You’ll land on https://www.instagram.com/explore/locations/c566755/jena-germany/.

Now the logic of the page is slightly similar to the infinite scroll idea of Instagram - only that here it is not infitnite (unfortunately) and new entries must be loaded with a button. And here we go have some fun with jQuery… or not. Too early to play! jQuery isn’t loaded on Instagram and you cannot simply load it with javascrpt as described in this stackoverflow question due some security protocol. Instead, don’t waste too much time thinking about it and directly throw jQuery in your browser console i.e. from jQuery’s CDN. Copy, paste, enter and we are good to go.

Click the “More” Button as often as new links appear - but don’t do it too fast, otherwise your IP will get blocked in the blink of an eye. A 700 ms break seems reasonable and worked well for me. With this script we’ll do it 20 times which should definitely reach the maximum amount of possible page links. If there is still the “More” button after execution, just reexecute.

1
2
3
4
5
for (var i = 1; i <= 20; i++) {
    (function(index) {
        setTimeout(function() { console.log(index);$(".vLP4P").click() }, i * 700);
    })(i);
}

Extract the location IDs

Just look for all “aMwHK” classes, extract the information to a dictionary and copy to clipboard (working for Chrome).

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
var loctitle = new Array();
var loc = new Array();
$('.aMwHK').each(function(){
  loctitle.push($(this).text());
  loc.push($(this).attr('href').split("/")[3]
);
})
var result = {};
loctitle.forEach((key, i) => result[key] = loc[i]);
copy(result)

Our result is a nicely formatted dictionary copied to clipboard.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
{
  "Jena": "673710",
  "Jena-Zentrum": "1006125655",
  "Plan B Boulderhalle Jena": "613227777",
  "Kassablanca Jena": "5752477",
  "Kulturarena Jena": "574057843",
  "University of Jena": "251848169",
  "Paradiespark Para - Jena Paradies": "228419929",
  "Ernst-Abbe-Sportfeld": "326665551",
  ...
}

In case you only need the location title or only the location ID, simply copy(loctitle) or copy(loc).

Now you can easily mine the respective posts with Fast Instagram Scraper, preprocess Instagram data or perform some analysis as we did for Instagreens Bonn! In case you have any hints for better large-scale Instagram location ID mining feel free to contact me!