This is a particularly big extension, just in time for Christmas. Pretty much all villages of any size are now covered, whether or not any administrative units were named after them.
Adding “places” is not about extending our content, but about linking together existing content and making it easier to find: associating locations, generally settlements, on our historical maps with administrative units named after them, and with entries about them in our descriptive gazetteers.
So what is this batch about?
Firstly, it means that almost all ecclesiastical parishes have been placed, the only exceptions now being within the City of London (and that is to avoid overloading the listing). This adds quite a few villages, many suburbs — and what look like “places” in the middle of nowhere, which are generally housing estates from the 1950s and 1960s which made it into Youngs’ listing of parishes:
Secondly, we have checked every descriptive gazetteer entry more than 200 characters long that was not already linked to a place, and where it described any kind of settlement we have defined it as a place. That length limit is obviously arbitrary, but shorter entries generally give just a name and a location with only a minimal description, like ‘a hamlet’.
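As a rough illustration of that first filtering step, here is a minimal sketch in Python. The record layout (`headword`, `text`, `place_id`) is hypothetical and not our actual schema, and the sketch only picks out candidates: deciding whether an entry really describes a settlement was still a manual check.

```python
from dataclasses import dataclass
from typing import Optional

MIN_LENGTH = 200  # the arbitrary cut-off described above


@dataclass
class GazetteerEntry:
    # Hypothetical record layout, for illustration only.
    headword: str
    text: str
    place_id: Optional[int] = None  # None means not yet linked to a place


def candidates_for_review(entries):
    """Return unlinked entries whose text is longer than MIN_LENGTH characters."""
    return [e for e in entries if e.place_id is None and len(e.text) > MIN_LENGTH]


# Made-up entries: one too short, one long and unlinked, one long but already linked.
entries = [
    GazetteerEntry("EXAMPLE A", "a hamlet"),
    GazetteerEntry("EXAMPLE B", "x" * 250),
    GazetteerEntry("EXAMPLE C", "y" * 300, place_id=630),
]
print([e.headword for e in candidates_for_review(entries)])
```

Only "EXAMPLE B" is returned here: it is the one entry that is both unlinked and over the limit.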
We have also defined a few new “places” which are not really populated places, but had interesting descriptive entries, such as:
One limitation is that the only location we hold for each place is a point coordinate, so we cannot really include features like railway lines or rivers. We do have some control over which map is displayed on each “place page”, so for districts like “Swaledale” we can specify a map showing a big area, but internally it is still a point:
As explained previously, as we link administrative units, descriptive gazetteer entries and references in travel writing together to define places, all the geographical names we harvest from those sources become names of the place. One result is that searching using any of those names will take you to the relevant place page.
However, there is another way we make use of all the names we hold for places, which is in ranking search results so that we list the most (hopefully) relevant place first: for example, if you search our system for “Newport”, grouping by “place” and counting the frequency of the names, you get:
select p.g_place, p.g_name, p.g_container, count(n.g_name) as freq
from g_place p, g_name n
where p.g_place=n.g_place and n.g_name='NEWPORT'
group by p.g_place, p.g_name, p.g_container
order by freq desc;

 g_place |     g_name      |   g_container   | freq
---------+-----------------+-----------------+------
     630 | NEWPORT         | SHROPSHIRE      |   13
     177 | NEWPORT         | HAMPSHIRE       |   12
    1121 | NEWPORT         | MONMOUTHSHIRE   |   12
     294 | NEWPORT PAGNELL | BUCKINGHAMSHIRE |    8
    6839 | NEWPORT         | ESSEX           |    8
    8390 | NEWPORT         | PEMBROKESHIRE   |    4
   21030 | NEWPORT         | DEVON           |    4
   13788 | WALLINGFEN      | EAST RIDING     |    4
   21031 | NEWPORT         | SOMERSET        |    3
   21029 | NEWPORT         | CORNWALL        |    3
   17409 | NEWPORT ON TAY  | FIFE            |    3
   25079 | NEWPORT         | NORTH RIDING    |    2
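If you want to see how this kind of ranking behaves without access to our database, the query above can be tried against a toy dataset. The sketch below uses Python's built-in `sqlite3` module rather than our real system. The table and column names follow the query, but the rows are invented, and both the upper-casing of the search term and the tie-break on `g_place` are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE g_place (g_place INTEGER PRIMARY KEY, g_name TEXT, g_container TEXT);
    CREATE TABLE g_name  (g_place INTEGER, g_name TEXT);
""")

# Invented places: two share a standard name, a third has it only as an alternative name.
conn.executemany("INSERT INTO g_place VALUES (?, ?, ?)", [
    (1, "EXAMPLETON", "COUNTY A"),
    (2, "EXAMPLETON", "COUNTY B"),
    (3, "OTHERHAM", "COUNTY A"),
])
# One row per harvested name, so repeated names raise a place's frequency.
conn.executemany("INSERT INTO g_name VALUES (?, ?)", [
    (1, "EXAMPLETON"), (1, "EXAMPLETON"), (1, "EXAMPLETON"),
    (2, "EXAMPLETON"),
    (3, "EXAMPLETON"), (3, "EXAMPLETON"),
    (3, "OTHERHAM"),
])


def rank_places(connection, search_term):
    """Return (g_place, g_name, g_container, freq) rows, most frequent first."""
    return connection.execute(
        """
        SELECT p.g_place, p.g_name, p.g_container, COUNT(n.g_name) AS freq
        FROM g_place p JOIN g_name n ON p.g_place = n.g_place
        WHERE n.g_name = ?
        GROUP BY p.g_place, p.g_name, p.g_container
        ORDER BY freq DESC, p.g_place
        """,
        (search_term.upper(),),  # assumption: names are stored in upper case
    ).fetchall()


for row in rank_places(conn, "Exampleton"):
    print(row)
```

In this toy data the place whose standard name is "OTHERHAM" still appears in the results, ranked second, because "EXAMPLETON" is held twice as one of its names. That is the same effect that puts Wallingfen in the Newport listing.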
Almost exactly the same query lies behind our home page searching; the only difference is that there you do not see the count. We hold the name “Newport” thirteen times for the place in Shropshire mostly because of all the administrative units named after it, but also because of a couple of references from descriptive gazetteers. You can see the details here:
We have actually made a small change to how the system works, so we are now including names from the descriptive gazetteers even where they match the “standard name”. This should make the relevance ranking work a little better.
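To show why this can matter for ranking, here is a small sketch in Python. It assumes that gazetteer names identical to the standard name were previously skipped, and the source labels ("unit", "gazetteer") and data are invented for the example; it is not our actual loading code.

```python
from collections import Counter


def name_counts(name_rows, standard_names, include_standard_matches):
    """Count stored names per place.

    name_rows: (place_id, name, source) tuples.
    standard_names: mapping of place_id to that place's standard name.
    """
    counts = Counter()
    for place_id, name, source in name_rows:
        matches_standard = source == "gazetteer" and name == standard_names[place_id]
        if matches_standard and not include_standard_matches:
            continue  # assumed old behaviour: drop gazetteer names equal to the standard name
        counts[place_id] += 1
    return counts


standard_names = {1: "EXAMPLETON", 2: "EXAMPLETON"}
name_rows = [
    (1, "EXAMPLETON", "unit"),
    (1, "EXAMPLETON", "gazetteer"),
    (1, "EXAMPLETON", "gazetteer"),
    (2, "EXAMPLETON", "unit"),
    (2, "EXAMPLETON", "unit"),
]

old = name_counts(name_rows, standard_names, include_standard_matches=False)
new = name_counts(name_rows, standard_names, include_standard_matches=True)
print("old:", old[1], old[2])
print("new:", new[1], new[2])
```

Under the assumed old rule, place 2 outranks place 1 (two names against one). Once the gazetteer references are counted, place 1 moves ahead with three.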
We have now defined 21,443 places, and they have 127,073 names in the names table associated with them.
As always, there are a couple of related changes planned. Firstly, there is now an embarrassingly large number of names in the names table whose source is defined simply as “GBHGIS”. In most cases, these are in fact names that appear on the map extracts on our place pages, but we currently have no mechanism for recording links between names and map images. Secondly, it is now time to revisit the travel writing collection, because a lot of additional places have been defined but references to them in the travel writing have not yet been harvested.