National Transit Gazetteer and Atlas Technical Documentation
Overview
This document provides information on the National Transit Gazetteer and Atlas, including project background, data sources and methodology, and additional context to help people use the platform with confidence. Additional discussion on the methodology used in the project is located on the Transit-Oriented Discoveries Blog
Project Timeframe
The National Transit Gazetteer and Atlas was developed between July 1-December 31, 2024. Transit agency and station data included in the platform represents fixed guideway modes (i.e. heavy rail, light rail, commuter rail, bus rapid transit, ferry, streetcar rail, monorail, incline plane, and ariel tramway) in operation as of September 30, 2024. We anticipate updating the platform in the fall of 2025 to incorporate stations that are newly opened or newly closed as of September 2025.
If you believe the platform is missing modes or stations, please contact connect@transitdiscoveries.com.
Data Elements and Sources
The project’s database relies on General Transit Feed Specification GTFS data for station locations and associated transit routes and data from the National Transit Database NTD Facility Inventory for station age, size and configurations. Data from Wikipedia articles and transit agency websites were used to confirm information from both sources and to selectively fill in gaps, when necessary. Data on a station’s the street address, city, county, and State was generated using Google Maps and reverse geocoding. The chart below summarizes the data used in the Gazetteer’s station summaries:
A Note About Bus Rapid Transit (BRT)
The Bus Rapid Transit systems included in the Gazetteer and Atlas are documented in submissions to the National Transit Map hosted by the Bureau of Transportation Statistics or the 2023 National Transit Database Facility Inventory. In a few instances, (such as the Madison Rapid Line A and the King County G Line, both of which opened in the fall of 2024) data that had not yet been reported to either of these databases was included in this platform. Some users of the Gazetteer may not find the BRT in their community included on the map. This is likely because of a lack of standardization on how the mode is defined, leading to inconsistencies in what qualifies as "BRT”. If you believe a BRT system is missing from the Gazetteer and Atlas, please contact connect@transitdiscoveries.com.
A Note about Station Construction Dates
As noted previously, the Gazetteer uses information from the NTD on the date that a station was constructed or reconstructed. Users may notice that some station summaries refer to a station “opening” at a much later date than is commonly understood, such as a Boston subway station that has been active since the early 20th century opening in 1984. This newer information may represent the date that a legacy station was renovated. Wikipedia entries about the station (if available) may provide additional information about a station’s history.
Use of Artificial Intelligence (AI)
This project uses OpenAI's GPT-3.5 model to craft a conversational, easy-to-understand summary of each station being queried. These summaries are not generated from scratch but are grounded in the GTFS and NTD data used in the project along with the Wikipedia entries generated with separate code. AI summaries may also include information about station usage or characteristics of transit agencies and surrounding areas not included in the GTFS and NTD datasets. Much of the information provided by the AI chatbot can be verified by examining the corresponding station area map, the Wikipedia sources generated, or the data in the database. This platform uses AI as an interpretive layer, not an independent source of information. If you have concerns that any of the information in the paragraphs is incorrect, please contact connect@transitdiscoveries.com.
Use of Wikipedia
The platform uses the Wikipedia Application Programming Interface (API) along with the name of the transit station selected, the name of the agency selected, and the term “transit station” to search Wikipedia for up to three relevant articles. The articles are incorporated into the AI prompt to provide additional context and returned alongside the generated summary for users to explore further.
Users may find that Wikipedia returns irrelevant articles for some station searches. Reasons for this may include non-unique station names (such as the “Convention Center Station” could exist in multiple transit systems) and the search algorithm’s limitations. As of January, 2025 the algorithm uses a basic keyword matching approach and provides results based on text similarity, not geographical relevance.
Future enhancements will incorporate the station's geographical coordinates into the Wikipedia search and will add more sophisticated filtering mechanism that cross-references the station's location with Wikipedia article metadata.
In addition, the platform is more likely to generate Wikipedia articles for larger stations in big cities, such as subway stations then for smaller stations such as bus rapid transit lines in smaller communities. Larger metropolitan areas tend to have more comprehensive Wikipedia coverage from volunteer contributors. Regions with more active local historians, transit enthusiasts, or Wikipedia editors will have more comprehensive articles. Stations with historical importance or unique architectural features are more likely to have dedicated Wikipedia pages
Use of OpenStreetMap
The platform connects with the Overpass API, which is part of the OpenStreetMap (OSM) ecosystem, to fetch nearby points of interest (POIs) dynamically. The query searches for amenities (such as public facilities), shops, and tourist attractions within 800 meters of the station longitude and latitude coordinates. The code the sorts the POIs by proximity and returns the closest three points of interest.
OSM points of interest are tagged in real-time so the three closest points of interest to a particular station today may not be the same a month from now.
Frequent Users of the National Transit Gazetteer and Atlas may notice that some stations are associated with somewhat trivial or quirky points of interest such as “a mailbox” or “a statue”. This is a direct result of how OpenStreetMap is crowd-sourced and how different contributors document their local environments. Some contributors are extremely detailed, mapping even very minor features like individual mailboxes, street furniture, or small landmarks while others focus on more major landmarks. There are very few strict rules about what can or cannot be mapped.
In addition, OSM tends to have a documentation bias where affluent areas tend to have more detailed POI documentation and regions with active tech communities or university campuses often have more comprehensive mapping.
Nevertheless, we hope you find the OSM points of interest valuable insofar as they add “local color” to station area summaries and enhance the information included in the GTFS or NTD data.
Use of Concentric Circle Overlays
The Gazetteer and Atlas incorporates allows users to visualize areas within a 200 meter, 400 meter, and 800 meter radius from the longitude and latitude coordinates of the transit stations. These areas were chosen consistent with common practice for transit-oriented analysis with a 200 meter radius representing the station and immediate surrounding area, an 800-meter radius incorporating the furthest distance that most people are willing to walk to a station and ¼ mile (400 meter) providing an intermediary threshold. Future enhancements may provide more sophisticated information about walking distance to a stion that takes into account local barriers to access (such as rivers or highways).
Some platform users may notice that the dots representing station areas, along with heir concentric circles, are not precisely aligned with the station location on the OpenStreetMap base layer. This typically has to do with how station location coordinates are captured in GTFS stops data versus how OSM volunteers chose to render the station location on the map. Typically these differences are less than 100 meters, however if users spot larger discrepancies, please contact connect@transitdiscoveries.com.
Future Enhancements
The National Transit Station Gazetteer and Atlas is a work in progress. Between January and April 2025, changes will be made in response to user feedback on completeness, accuracy and usability with updates provided to individuals who have subscribed to the Transit-Oriented Discoveries site. Future enhancements include adding demographic data from the U.S. Census and incorporating walksheds, land use and development density analysis, and analyzing roadway infrastructure and walkability.
Subscriptions and access to Raw Data and Software
The National Transit Station Gazetteer and Atlas is available free of charge until April 15, 2025 around which time additional information on a subscription service will be available. Subscription service will include access to a csv file of the GTFS and NTD data used in the platform. We anticipate that a simplified version of the Gazetteer and Atlas will be available for free. Software developed for the National Transit Station Gazetteer and Atlas is proprietary and not available for sharing at this time.