What this affiliate wants in a ticket datafeed/api – besides clean data

I was asked recently to spell out what ticketing agencies and companies can do to help make our job easier for integrations, and after smiling (since nobody ever really ASKS how they can help), I started rambling on and on about data formats and relationships.

 

This was originally going to be a closed email, but instead I figure it will make a better blog post.

If you’re an event data aggregator, programmer, or just like to have an opinion, please make sure you read everything before posting comments. That said, if you feel I missed something or if YOU have a better idea, we’re always open to suggestions.

 

The Problem is always a lack of good data.

Every site and service we integrate with has certain oddities lurking in its data. It could be venues with missing or incorrect addresses, artists with incorrect names, or multiple different artist IDs representing the same act (I’m looking at you, Ticketmaster/Livenation). Some of these inconsistencies are easy to spot, but many are subtle, hard to track down, and only become apparent after the integration is completed.

The first thing I’d ask for in a data feed or API from a ticketing partner is consistently good data. Don’t provide complete data on 99% of items and then blank out a critical field on the rest. Just give us the same stuff all the time and we will be very happy campers.
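To make "consistently good data" a bit more concrete, here is a minimal sketch of the kind of check we end up running on our side. The list of required fields is my own assumption (it borrows field names described further down), and the sample records are made up.

```python
# Hypothetical list of fields a consumer treats as critical; adjust to your feed.
REQUIRED_FIELDS = ("event_id", "event_name", "event_start_time", "venue_id")


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent, None, or blank in a record."""
    missing = []
    for field in required:
        value = record.get(field)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(field)
    return missing


# Made-up sample feed: the second record blanks out two critical fields.
feed = [
    {"event_id": 1, "event_name": "Example Show",
     "event_start_time": "2024-05-01T20:00:00", "venue_id": 7495},
    {"event_id": 2, "event_name": "",
     "event_start_time": "2024-05-02T19:30:00", "venue_id": None},
]

for record in feed:
    problems = missing_fields(record)
    if problems:
        print(f"event {record.get('event_id')}: missing {', '.join(problems)}")
```

If a provider runs something like this before publishing the feed, the 1% of broken records gets caught on their side instead of ours.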

Your Data is related, so make that obvious

When examining some data sources we find an easy to understand pattern:

Venues have Events, Events have one or more Artists and they begin and end at set times.

Events may include tickets, and tickets may include pricing data, availability data, urls to purchase tickets etc.

 

What’s truly baffling is that sometimes pieces are missing from the data, which breaks these relationships. A good event feed will be complete and accurate. Ideally, we should be able to map your data to our internal structures pretty easily.

Which brings me to my next point: Have a simple, clear, and flexible structure for things.

 

Here’s what would work best if the data was coming through in a JSON format. Much of this would also apply to XML or CSV, but CSV data has some limitations and so we won’t be discussing it in great detail, plus I can’t stand XML so we’ll leave that alone as well.

The Example (almost complete)

event_id -> a unique identifier that corresponds to a specific event happening at a venue with a start and end date.

event_name -> This is not the same as the artist name, unless that’s really the title of the show. Madonna tours constantly, sometimes under a tour name. WWE events are “Raw, Wrestlemania, etc…” UFC events are all numbered, but are all part of UFC. The event name is the name of the event, not the artist name.

event_status -> Tell me something about this event. Is it cancelled? Postponed? Happening now? Already on sale? Not on sale yet? At the very least you should give me a clue whether I should be removing the event from our system (cancelled shows) or whether everything is on track and on schedule. What gets included and the format is up to each agency. Most have a flag for cancelled shows, some go much further.

event_start_time -> When does the event start, if you want to give me a bonus, show me what time the doors open in a secondary field.

event_end_time -> If known, when does the event end? For some resident shows (Broadway, London, Las Vegas, Branson, etc.) these are well established and every show ends within a few seconds of its scheduled time. For rock concerts on tour this might be hit or miss, and isn’t truly essential in my opinion, but it is nice to have. We don’t use this particular field, but I can think of several kinds of site that could use it in some interesting ways.

venue -> information about the venue where the event is taking place goes here. Including a JSON object with the information about the venue is perfect. Key information about the venue includes things like:

venue_id -> an identifier that is unique to the venue.

venue_name -> Please adopt a consistent format, I’ve seen some feeds where half the venues have Proper Names while the other half are ALL CAPS.

venue_latitude -> Yes, we CAN look this up from the address, but honestly it’s a bit of a pain and since you’ve got the data, just send it along so we can get new venues that you add to your system into ours with a minimum of fuss.

venue_longitude -> Second part of location information. Since local is a powerful feature and we organize our system based on the distance from the center of a particular city, knowing where things are is important.

venue_address

venue_city

venue_state

venue_country

venue_zip

venue_timezone -> You’d think this would be easily guessable based on the state. You’d be wrong. There are edge cases that are surprisingly annoying.

venue_DMA -> optional, nice to have if we’re going to use it to match up against a mailing or email list, but not critical for what we do at BoxOfficeHero.

venue_notes -> Here you can include things that might be useful to tell our readers.. things like “No elevator to balcony” might be useful. You can include HTML, or not.. but if you choose one or the other please be consistent in the format.

venue_image_large -> A url to a large image that shows the venue or its logo. Remember to allow “hotlinking” of this image. You don’t need to know what that means, your IT/Web guys will figure it out.

venue_image_small -> A url to a small image that shows the venue or its logo. Again, something we can hotlink to would be ideal.

venue_status -> Not really a critical field, but if an agency stops representing a venue, or a venue is closed due to a fire or renovations it might be nice to know, if we ever get this kind of data we could alert our members to the closure or disruption.

event_url -> A url I can send visitors to that will allow them to buy tickets. Depending on your systems this could already be an affiliate link, or it might need to be transformed by our system to become an affiliate link. We don’t care, just as long as the link is to the actual event and not to your home-page. Deep linking is the only linking we do, so please help us to help you by telling us where to send traffic.

event_type -> We bucket things into the same major categories that Ticketmaster does: Concerts, Sports, Arts/Theater, Family and Other. Celine Dion is a Concert, unless she is playing football, in which case it would qualify as Sports. If you really have no idea, put it in the Other category and we’ll probably just ignore it when we bring the feed into our system. You can use an integer value for this field with numbers, or a text field with a word, it’s up to you.

event_genre -> Here’s your chance to say this is a “Comedy Concert” a “Musical”, or “Rock and Roll” – some artists have multiple styles, and if you know the format of the show, telling us makes it easier for us to show it to the right audience.

artists -> Here is where we want to include data on the performers who are appearing at the event. In many cases this will be a single artist, but in some cases there are multiple acts performing. Give us as much accurate data as you can.

artist_id -> Are you noticing a theme yet? If your system has a de-facto identifier for an artist, please use it. We’ll use it to match the artist in your system to the artist in our system, creating more listings on a single page for events with that artist appearing. This makes our life easier, it makes your listings appear more clearly in our system, and helps us send the right events to our members.

artist_name -> Obvious

artist_image_large -> see above venue_image_large

artist_image_small -> see above venue_image_small

artist_genres -> if you want or can, including information on the major genre for an artist can help us match the event to preferences shared by our users. Whatever you can give us, just make sure it’s documented somewhere and that there is an “other” option for outliers.

artist_twitter -> If you’ve got it, including the twitter handle of the artist helps us do some clever tricks.

artist_homepage -> if the artist has an official site, this always helps too.

tickets -> ticket information could be as simple as a price and currency, or could include information such as onsale and presale dates, current availability of tickets, price ranges, or any combination thereof. Whatever data you can provide will help us give our members more information and better choices. At a minimum, including the date tickets go on sale, and the price range would be fine, including a flag for onsale vs presale tickets would also be excellent.

deals/offers -> If an event has deals, discounts, a bundle of 4 tickets at a reduced rate, etc.. this is the place to describe it. We will integrate things so that the deal(s) appear as ticket blocks on the event page and can help boost sales of distressed inventory. If you simply lower the price of tickets in the tickets data element our system will NOT pick this up as a deal.
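Putting the list above together, one event might look roughly like this. It is only a sketch: the top-level field names come from the list, but every value is invented, and event_end_time plus the keys inside tickets and deals are names I am assuming for illustration. It is written as a Python dict and dumped to JSON so you can see what would go over the wire.

```python
import json

# Only the field names come from the list above; every value is invented.
# "event_end_time" and the keys inside "tickets" and "deals" are assumed names.
event = {
    "event_id": "evt-1001",
    "event_name": "Example Tour 2024",
    "event_status": "onsale",
    "event_start_time": "2024-05-01T20:00:00-07:00",
    "event_end_time": "2024-05-01T23:00:00-07:00",
    "event_url": "https://www.yoursite.com/events/evt-1001",
    "event_type": "Concerts",
    "event_genre": "Rock and Roll",
    "venue": {
        "venue_id": 7495,
        "venue_name": "Example Hall",
        "venue_latitude": 36.1147,
        "venue_longitude": -115.1728,
        "venue_address": "123 Example Blvd",
        "venue_city": "Las Vegas",
        "venue_state": "NV",
        "venue_country": "US",
        "venue_zip": "89109",
        "venue_timezone": "America/Los_Angeles",
        "venue_notes": "No elevator to balcony",
        "venue_image_large": "https://www.yoursite.com/img/venues/7495_large.jpg",
        "venue_image_small": "https://www.yoursite.com/img/venues/7495_small.jpg",
        "venue_status": "open",
    },
    # A list, so multiple acts on one bill need no special handling.
    "artists": [
        {"artist_id": 10, "artist_name": "Headline Act", "artist_genres": ["Rock"]},
        {"artist_id": 11, "artist_name": "Opening Act", "artist_genres": ["Other"]},
    ],
    "tickets": {
        "currency": "USD",
        "price_min": 45.0,
        "price_max": 150.0,
        "onsale_date": "2024-02-01T10:00:00-08:00",
        "presale": False,
    },
    # Deals are listed separately from tickets so they can be picked up as deals.
    "deals": [
        {"description": "Bundle of 4 tickets at a reduced rate",
         "price": 140.0, "currency": "USD"},
    ],
}

payload = json.dumps(event, indent=2)
print(payload)
```

The point of nesting venue and artists inside the event is that a single record carries its own relationships, so we can process one event without looking anything else up.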

Accessing Data

SOAP, REST, FTP, HTTP, HTTPS.. Argh!@!!!!

There are loads of ways you can “protect” your data, but really, anyone using it is probably trying to make money by linking through your affiliate program. Consider a wide-open API that requires only a simple authentication token.

If all I have to do to request a list of all your venues is to request something like this:

www.yoursite.com/api/venues_list?token=12345XYZ

That’s great.
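For the programmers reading along, this is about all the client code a token-in-the-query-string API needs. It is a sketch under assumptions: www.yoursite.com is the placeholder host from the example, the https scheme is my guess, and build_url and fetch_json are names I made up.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://www.yoursite.com/api"  # placeholder host; the https scheme is assumed


def build_url(endpoint, token, **params):
    """Build a GET URL with the token and any extra parameters in the query string."""
    query = urlencode({"token": token, **params})
    return f"{API_BASE}/{endpoint}?{query}"


def fetch_json(url):
    """Fetch a URL and decode the JSON body (not called here: the host is a placeholder)."""
    with urlopen(url) as response:
        return json.load(response)


print(build_url("venues_list", "12345XYZ"))
# prints https://www.yoursite.com/api/venues_list?token=12345XYZ
```

No custom headers, no cookies, no POST body. Any affiliate with a basic HTTP library can get started in a few minutes.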

If you’re going to make me send weird http headers, store a cookie and send that along, and POST data for no good reason at all – WE can do that, but there are many other affiliates and sites that have a much harder time, and digging through piles of documentation on HOW to make a request is .. painful.

You can mitigate this by providing wrapper classes for PHP, Ruby, Java, Javascript etc. to allow programmers to easily connect to your system, and if you can make technical resources available to assist via a public forum or support system that will also help your data flow outwards into more and more systems and sites.

Summary: REST is the way you should do it unless you believe that SOAP is the answer. I don’t know what will work for your other partners, REST with JSON coming back is great for us.

 

Return Fields – everything, or bits and pieces?

Some ticketing systems have so much data that the transfer from server to server acts as a bottleneck. Fortunately there is an easy option: Don’t send it all back. If we can specify the fields we want and limit to only those fields, we can make the load on your system lighter, reduce your bandwidth, and everyone’s costs. This is pretty easy to do from a technical standpoint, so consider putting it in the spec.

Example:

www.yoursite.com/api/get_all_venues?token=12345XYZ&fields=venue_id,venue_name

This would return a list of venue IDs and names. We would look for any that are new and ask your system for them individually, like this:

www.yoursite.com/api/venue_detail?token=12345XYZ&venue_id=7495
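Here is a rough sketch of both halves of that exchange: the provider trimming records down to the requested fields, and the consumer working out which venues it still needs to fetch. The function names, the sample venues, and the https scheme are all assumptions on my part, not anyone's real API.

```python
from urllib.parse import urlencode

API_BASE = "https://www.yoursite.com/api"  # placeholder host; the https scheme is assumed


def select_fields(records, fields_param):
    """Provider side: keep only the requested comma-separated fields in each record."""
    if not fields_param:
        return records  # no fields parameter means "send everything"
    wanted = [name.strip() for name in fields_param.split(",") if name.strip()]
    return [{name: rec[name] for name in wanted if name in rec} for rec in records]


def new_venue_ids(listing, known_ids):
    """Consumer side: venue IDs in the slim listing that we have not stored yet."""
    return [venue["venue_id"] for venue in listing if venue["venue_id"] not in known_ids]


def venue_detail_url(token, venue_id):
    """URL for the full record of a single venue."""
    query = urlencode({"token": token, "venue_id": venue_id})
    return f"{API_BASE}/venue_detail?{query}"


# Made-up provider data and a made-up set of IDs already in the consumer's database.
all_venues = [
    {"venue_id": 7495, "venue_name": "Example Hall", "venue_city": "Las Vegas"},
    {"venue_id": 7496, "venue_name": "Sample Arena", "venue_city": "Branson"},
]
listing = select_fields(all_venues, "venue_id,venue_name")

for venue_id in new_venue_ids(listing, known_ids={7496}):
    print(venue_detail_url("12345XYZ", venue_id))
# prints https://www.yoursite.com/api/venue_detail?token=12345XYZ&venue_id=7495
```

The slim listing is cheap to request often, and the expensive detail call only happens for venues we have never seen.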

Of course what endpoints you build and how you structure them is entirely your business. We’re one client, and it’s very likely we won’t be your biggest one. Still, giving us as much data as possible and flexibility in how we consume it is the important thing.

 

Should requests be “signed”?

Listen, I’m not here to tell you how secure or insecure you should build your systems, but forcing developers to deal with “Why won’t the request signing work” is a stupid waste of time. I’ve had 3 people pore over code that was taken directly from a vendor’s API example page, only to find a blog post somewhere else online that describes a problem with character encoding in multibyte strings and signing code that we were being required to use. Yeah, not fun. Makes me want to scream.

Especially if there is nobody answering emails with requests for assistance, and if the documentation is wrong…

I’m making a request for data, signing should not be required. If some other sites want to be able to create orders, reserve tickets, or make CHANGES to your inventory, yes, by all means implement signing and any other crazy security measures you can think of – those partners should probably have their own discrete endpoint for those things, perhaps THAT endpoint should require signing, and IP whitelisting too.

 

So what kind of requests will you make?

Ideally we want to be able to pull the following kinds of data:

A complete list of venues you sell tickets for.

A complete list of events happening AT a specific venue.

Ticket Pricing options and onsale/presale/deal information for a given event.

A firehose of the last 100 to 10,000 events added to your system would be nice, but is not essential. If we only have to pull down one file and can loop through the structure, you’re saving a ton of HTTP requests and making everyone’s lives easier, but only if you’re going to return complete data structures in that response. Ask your techies what that means, they should understand.

Anything else?

Sure – If something is non-commissionable in your affiliate program, putting that in the data would be nice – we’ll certainly use it anyways, but others might not.

Reporting API errors with authentication or request format in a reasonably nice way would help – “ERROR 4387” doesn’t really help me, or anyone else, but a well written “You are missing the token parameter” is much clearer and takes just as long to write into your code.
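To show how little extra work a readable error is, here is a minimal sketch of a provider-side check. The payload shape, the 401 status, and the function names are my assumptions. The numeric code is kept for your own logs, the message is for the poor developer on the other end.

```python
import json


def error_response(status, code, message):
    """Build an error payload that keeps the numeric code but explains it in words."""
    return status, json.dumps({"error": {"code": code, "message": message}})


def check_request(params):
    """Validate the query parameters of an incoming request (hypothetical handler)."""
    if not params.get("token"):
        # Same internal code as before, now with a sentence a human can act on.
        return error_response(401, 4387, "You are missing the token parameter")
    return 200, json.dumps({"ok": True})


status, body = check_request({"fields": "venue_id,venue_name"})
print(status, body)
```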

If you have promotional videos for a show, those might be fun to include, if you have an official site for an event or an artist, that’s nice to have, reviews or ratings for an artist or show could help.. Basically ANY data you can include might inspire us to build something clever with it. It might not, so don’t go way out of your way, but if it’s there in your database anyways try your best to expose it via the api.

 

Preferred Output Format

It’s a matter of personal preference, but my view is that XML is almost useless for consuming this kind of thing and needs to go far, far away to die in a hole. This is a personal opinion, mostly born out of frustration with bad WSDLs.

Add to that the complexities of namespaces, validation, and the overhead of all the encapsulation, and in many cases XML feeds end up much larger than flat text files while providing limited additional benefits compared to the best solution, which in my humble opinion is..

JSON

The BEST way to provide the data in our opinion is a collection of JSON objects. These are easy to work with in browser or using any backend web language and allow for the exchange of complex data structures in a very easy to consume format. If you want to help us, and everyone else who has a site that is similar, make available JSON endpoints. Just tell that to the technical people, they will understand what that means, everyone will thank you and you’ll end up looking like a hero.

If JSON is off the table (and I’m always confused as to why it would be), just make things easy and export events out to a CSV file.

CSV is very easy to consume, but can lead to interesting problems when it comes to multiple artists appearing at the same event, how does one send that along in an efficient way? You could create a supporting artist column, but if that’s the case you are just producing limitations. If you want to get creative and export a 2nd file with all the Artists in it, matching can be done in the backend of our system, and while this is fine it means we cannot process individual events without processing the complete dataset at the same time which can get expensive.
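The two-file approach can be sketched like this. The column names and sample rows are invented for illustration. Notice that the whole artists file has to be read and indexed before a single event can be completed, which is exactly the cost I’m complaining about.

```python
import csv
import io
from collections import defaultdict

# Made-up sample exports; a real feed would be two files on disk or over HTTP.
EVENTS_CSV = """event_id,event_name,venue_id
1,Example Festival,7495
2,Solo Night,7495
"""

ARTISTS_CSV = """event_id,artist_id,artist_name
1,10,First Act
1,11,Second Act
2,12,Solo Performer
"""


def load_events(events_file, artists_file):
    """Join the events file to the artists file on event_id."""
    # The complete artists file must be indexed before any event is usable.
    artists_by_event = defaultdict(list)
    for row in csv.DictReader(artists_file):
        artists_by_event[row["event_id"]].append(
            {"artist_id": row["artist_id"], "artist_name": row["artist_name"]}
        )
    events = []
    for row in csv.DictReader(events_file):
        event = dict(row)
        event["artists"] = artists_by_event.get(row["event_id"], [])
        events.append(event)
    return events


events = load_events(io.StringIO(EVENTS_CSV), io.StringIO(ARTISTS_CSV))
for event in events:
    names = ", ".join(artist["artist_name"] for artist in event["artists"])
    print(f"{event['event_name']}: {names}")
```

With JSON the artists simply arrive inside each event, so none of this matching code needs to exist.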

Conclusion: CSV works well enough for simple events, but for richer data JSON is the way to go – if you’d like extra Karma points, let us specify an output format when making the request so we can process JSON, CSV or XML.

 

One last thought

Try to put yourself into the shoes of your affiliates – we need data on what’s happening, where it’s happening, and content we can use to present it to our audience. The more you can give us, the better, and the more you can listen to our feedback and respond the happier we will be.

Strongly consider setting up a forum or group somewhere for API users to submit their feedback and ask for help – and please don’t get defensive about your data and how we use it – at the end of the day we’re trying to help users find and attend your events, anything we do that furthers that goal should be allowed and encouraged. If you have restrictions, please make them clear and don’t go tightening them later if it can be avoided.

So, did I miss anything?

 

These are the opinions of Eric Schwarzer, our lead developer, and while we love him dearly, he does say some strong things sometimes. Please don’t get offended. He will also admit when he’s wrong and is eagerly looking forward to defending himself in the comments area below.
