DHL is striving to make its parcel delivery business as efficient and effective as possible. (See main story.) One thing the company is working on is distinguishing deliveries to business addresses from deliveries to residential addresses.
Deliveries to businesses, obviously, are best made during normal business hours, while residential deliveries are best made outside those hours. The goal is to sort parcels into these two categories at the beginning of the process and assign them to separate routing systems, something DHL has not done before. In both cases, the company is striving to deliver the parcels at times that are best for each consignee.
A growing number of parcels carried by DHL involve cross-border movements, which complicates matters on several levels.
“We operate in 220 countries,” noted Andre Wittfoth, a development manager at DHL. “This involves different language sets and different ways of denoting addresses.”
As is the trend throughout the logistics and supply chain industries, DHL is interested in applying the highest degree of automation to this process. This involves the application of big data technologies, including cross-referencing the names and addresses provided by shippers against databases of postal codes, personal names, and business names and locations.
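To make the cross-referencing idea concrete, here is a minimal Python sketch. It is not DHL's or Teradata's implementation: the lookup sets `PERSONAL_NAMES` and `BUSINESS_TERMS`, and the function names, are hypothetical stand-ins for the much larger databases described above.

```python
import re

# Hypothetical lookup tables; a real system would query large databases.
PERSONAL_NAMES = {"murphy", "kelly", "byrne", "siobhan"}
BUSINESS_TERMS = {"ltd", "limited", "pharmacy", "garage", "solicitors"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def cross_reference(consignee):
    """Report which tokens of a consignee line match each lookup table."""
    tokens = tokenize(consignee)
    return {
        "personal_hits": [t for t in tokens if t in PERSONAL_NAMES],
        "business_hits": [t for t in tokens if t in BUSINESS_TERMS],
    }

print(cross_reference("Siobhan Kelly, 12 Main Street"))
print(cross_reference("Dunne Pharmacy Ltd, High Street"))
```

The hits on their own decide nothing; they are the kind of clue a downstream model could weigh.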
“We wanted to find a method to automatically calculate residential versus business addresses,” said Wittfoth.
The company, in collaboration with technology provider Teradata, decided to pilot this effort with a baptism by fire in Ireland. Why Ireland? Because that country doesn’t have a postal code system, so the effort would be solely dependent on the analysis of data found in the natural language information supplied by shippers.
DHL had five years of data on about 900,000 deliveries made in Ireland. This included information about when an individual was not home or a business was closed when a delivery was attempted. The data also included 400,000 personal and family names, access to a Google database of business addresses, and 500,000 customer tokens: textual information that supplies clues about the nature of the consignee.
“The main success criteria was to generate a minimum of false positives,” said Jonas Svaton, a data scientist at Teradata, who worked on the project.
The strategy was to develop a model to use the data to make the residence-business distinction and then assemble a data set with known results to test the model. After that, the process moved to applying the model to live data without known results. The model analyzed the information given and provided a score as to whether the address in question was more likely a residence or a business.
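The build, test, and score workflow can be illustrated with a small Python sketch. The article does not say which model was used, so this substitutes a simple token log-odds scorer with add-one smoothing. The sample addresses, the function names, and the reading of "false positive" as a residence flagged as a business are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())

def train(examples):
    """Learn a log-odds weight per token from (text, label) pairs."""
    counts = {"business": Counter(), "residence": Counter()}
    for text, label in examples:
        counts[label].update(tokenize(text))
    vocab = set(counts["business"]) | set(counts["residence"])
    totals = {label: sum(c.values()) for label, c in counts.items()}
    weights = {}
    for tok in vocab:
        # Add-one smoothing keeps unseen tokens from producing log(0).
        p_b = (counts["business"][tok] + 1) / (totals["business"] + len(vocab))
        p_r = (counts["residence"][tok] + 1) / (totals["residence"] + len(vocab))
        weights[tok] = math.log(p_b / p_r)
    return weights

def score(weights, text):
    """Positive score leans business, negative leans residence."""
    return sum(weights.get(tok, 0.0) for tok in tokenize(text))

def false_positives(weights, labeled, threshold=0.0):
    """Count residences scored as businesses (assumed definition)."""
    return sum(1 for text, label in labeled
               if label == "residence" and score(weights, text) > threshold)

# Made-up training data with known results.
training = [
    ("Kelly Motors Ltd, Main Street", "business"),
    ("Dunne Pharmacy, High Street", "business"),
    ("Mary Byrne, Rose Cottage", "residence"),
    ("Sean Walsh, Hill View", "residence"),
]
# Made-up held-out data, also with known results, used to test the model.
held_out = [
    ("Mary Walsh, Rose Cottage", "residence"),
    ("Byrne Pharmacy Ltd", "business"),
]

weights = train(training)
for text, label in held_out:
    print(f"{text!r}: score={score(weights, text):+.2f} (known: {label})")
print("false positives:", false_positives(weights, held_out))
```

Raising `threshold` above zero would trade missed businesses for fewer false positives, which matches the success criterion quoted above.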
A field test of the process showed the data model provided 20 percent better predictions than manual processes, with 80 percent fewer false positives. Next steps include testing more data, piloting the process in two other countries, and then rolling it out to 37 countries in the DHL network.