Songdata (songdata.io)
86 points by bekind on Jan 7, 2022 | hide | past | favorite | 36 comments



I've been using this site ever since I discovered it! I'm glad to see it featured on HN.

I started learning to play the Romanian Kaval [1] last summer and Songdata helped me enormously in finding good backing tracks to improvise on.

I'm mostly improvising on chillhop tracks right now because I'm trying to develop less traditional rhythms (although I'm still clumsy with the breathing and knowing when to keep the silence): https://soundcloud.com/alin-panaitiu-978520831/improv-1-on-m...

I've also developed Noiseblend (https://noiseblend.com) which provides a way to choose a song key when discovering by artists [2] [3] but I still find myself plugging a good playlist into Songdata and just choosing the A minor tracks.

[1] https://www.youtube.com/watch?v=QBElWgzJv8M

[2] https://www.noiseblend.com/playlist?artists=3U6eCXHFS6wQVuFu...

[3] https://cln.sh/SmtVf8


I just searched for "yer blues" by The Beatles. Different mixes of the same exact studio recording have different keys and BPMs listed. I could perhaps understand if it gets the demo versions wrong, but when subtly different remasters/mixes of the same exact source material result in wildly different output, I don't see how this data can be trusted at all. Was that just one freak result out of a million correct ones...?


The data comes from Spotify’s Track Audio Features API. [1] [2]

I believe this data was imported from the EchoNest project when Spotify acquired them [3] and it’s possible that algorithms have improved since 2014 but not all old tracks have been re-analyzed. I would guess that this is what’s causing the discrepancies you’re seeing.

[1] https://developer.spotify.com/console/get-audio-features-tra...

[2] https://cln.sh/TfGPcj

[3] https://en.wikipedia.org/wiki/The_Echo_Nest


MusicBrainz has been around for more than two decades and is the gold standard for this kind of data (and more):

https://musicbrainz.org/


For key and BPM data, the MetaBrainz foundation also has AcousticBrainz: https://acousticbrainz.org/ (I'm a developer on this project). Unfortunately, I would say that the data that we have in AcousticBrainz isn't as good as what's in the Spotify API, although the algorithms that we use are completely free (available in the Essentia signal processing library - https://essentia.upf.edu/). Over the last few years the algorithms in Essentia have improved and we're hoping to release a new version of the tool used in AcousticBrainz to improve the database.


Adding on, for anyone looking to fetch MusicBrainz data automatically for a big library, this is the gold standard tool to do so:

https://beets.readthedocs.io/en/stable/


I hadn't heard of the Camelot song feature before. It seems to be a code indicating compatible keys: https://www.quora.com/What-does-Camelot-mean-in-musical-term...


"Camelot wheel" is a misnomer for what's actually known as the circle of fifths. Either way, it's a way of quickly spotting "closely related" keys (e.g. C and G are closely related because G is the dominant tone wrt. C) and relative minors/majors (A minor is a relative key to C major, because it uses the same notes with a different tonal center and assignment of scale degrees).


I see the data is coming from Spotify. My experience from trying to use their API was that the allowed use cases are quite restrictive and you had to request access (they wouldn't let us do what we wanted).

If the developer is here, what is the agreement you have with Spotify? Do you have any plans of offering an API yourselves?

I would be interested in a paid API that gave us access to the data available through the Spotify API; they don't offer one.


You can find the Spotify Developer Terms here: https://developer.spotify.com/terms

Basically, you're not allowed to sell the Spotify data in any way.


In those terms, it says "you may not store, aggregate or create compilations or databases of Spotify Content". So for this site to be compliant, they should be serving up the data directly from the Spotify API, which ought to get rate limited pretty quickly (when on the HN frontpage and all), unless they somehow got a very large quota.


I’m serving everything from the Spotify API directly on Noiseblend (https://noiseblend.com) and rate limit has never been a problem.

When a request fails because of a rate limit, Spotify responds with 429 and a Retry-After header so you know when to schedule the next request.

In my tests, that header never had a value greater than 10 seconds, and 429 responses were very rare.
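The 429 handling described above is easy to sketch. Here `fetch` is a stand-in for whatever HTTP client you use (requests-style response with `.status_code` and `.headers`); per Spotify's docs, `Retry-After` is sent as whole seconds:

```python
import time

def fetch_with_retry(fetch, url, max_attempts=5):
    """Call fetch(url), honoring Spotify's 429 + Retry-After backoff.

    `fetch` is any callable returning an object with .status_code
    and .headers (e.g. a requests-style response).
    """
    for _ in range(max_attempts):
        resp = fetch(url)
        if resp.status_code != 429:
            return resp
        # Spotify sends Retry-After in whole seconds; default to 1s if absent.
        time.sleep(int(resp.headers.get("Retry-After", 1)))
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```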


Right, but you are using user login, and I think that gives you a lot more rate than a single account doing all the requests, which is what you need to get the no-sign-in experience on songdata.io - right?


Indeed, I forgot about that part...

I think all of my requests to Spotify have been made on behalf of authenticated users. But I guess you could authenticate with your own user to lift the limits.


Spotify rate limits are per app, not per user.


Some time ago I wanted to see how many songs have bass notes that go below 50 Hz, and I quickly realised that it was way over my expertise to do that, but I'm still curious: would it be possible to display the frequencies for a particular track?


What you seem to be looking for here is a Fourier transform/spectrogram. The vast majority of audio processing toolkits have prebuilt tools along those lines for you to use.
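As a concrete sketch of that idea: when you only care about one frequency (say, "is there content at 40 Hz?"), a single-bin DFT (the Goertzel algorithm) does the job without a full spectrogram. Below, a synthetic 40 Hz tone is compared against a 200 Hz one, pure stdlib; a real track would first need decoding to PCM samples:

```python
import math

def goertzel_power(samples, sample_rate, target_hz):
    """Energy of `samples` at `target_hz` via the Goertzel algorithm,
    a single-bin DFT (cheaper than an FFT when only one frequency matters)."""
    n = len(samples)
    k = round(n * target_hz / sample_rate)   # nearest DFT bin
    w = 2.0 * math.pi * k / n
    coeff = 2.0 * math.cos(w)
    s1 = s2 = 0.0
    for x in samples:
        s = x + coeff * s1 - s2
        s2, s1 = s1, s
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

# Synthetic signals: one with a 40 Hz fundamental, one without.
rate, n = 1000, 1000
bassy = [math.sin(2 * math.pi * 40 * t / rate) for t in range(n)]
no_bass = [math.sin(2 * math.pi * 200 * t / rate) for t in range(n)]

print(goertzel_power(bassy, rate, 40) > 1000 * goertzel_power(no_bass, rate, 40))  # True
```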


You should be able to get that from the Spotify API endpoint that returns a track's audio analysis.


I looked into that but there was no information on frequencies.

There’s pitch but that only cares about in which key the frequencies are concentrated (e.g. it tells you that this part of the song has mostly C pitches but not if it’s C2 or C4)
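That matches how the audio-analysis segments expose pitch: a 12-element chroma vector, one value per pitch class from C to B, which is exactly why the octave is unrecoverable. Picking the dominant pitch class is just an argmax (the sample vector below is hypothetical):

```python
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def dominant_pitch_class(chroma):
    """Name of the strongest bin in a 12-element chroma vector.

    Chroma folds all octaves into one bin per pitch class, so C2 and
    C4 both land in the same "C" bin -- octave information is gone.
    """
    if len(chroma) != 12:
        raise ValueError("expected a 12-element chroma vector")
    return PITCH_CLASSES[max(range(12), key=lambda i: chroma[i])]

# Hypothetical segment where energy is concentrated on C:
segment = [0.95, 0.1, 0.2, 0.05, 0.3, 0.1, 0.0, 0.4, 0.1, 0.2, 0.05, 0.1]
print(dominant_pitch_class(segment))  # C
```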


Shameless plug: I run https://volt.fm which also provides this data (and a bit more).


The moment I saw on the front page that Adele's "Easy On Me" was listed at 142 BPM, I was flabbergasted.

You only have to play the song while looking at a timer for about 15 seconds to see the BPM is far less than half that number.


It seems they just don't understand time signatures.

Andy Williams "It's the Most Wonderful Time of the Year" is listed as 202 bpm. In some sense, this is correct. The song is in 6/8 time and if you count each of the six beats separately, there are 202 bpm, but this isn't how 6/8 works. Typically you would set the bpm based on a dotted quarter (= 3/8ths) instead of a quarter.

From a listener's perspective, 6/8 basically just means each beat is divided into 3 parts instead of 2. So in a "normal" 4/4 song, you'll hear a down-up, down-up repetition like KICK-up-SNARE-up-KICK-up-SNARE-up, with kick drums on downbeats of 1 and 3 and snare on downbeats of 2 and 4 with one upbeat after each. With 6/8, you'll hear KICK-up-up-SNARE-up-up basically.

(Note that this is distinct from 3/4 [waltz], where you'll hear KICK-up-SNARE-up-SNARE-up, but 3/4 is exceedingly rare in popular modern music.)
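The dotted-quarter conversion described above is just division by three. A throwaway helper makes the Andy Williams example concrete (202 is the figure quoted above, not something I've re-measured):

```python
def felt_bpm_in_6_8(eighth_note_bpm: float) -> float:
    """Convert a BPM counted in eighth notes to the felt dotted-quarter
    pulse of 6/8 time: each felt beat spans three eighth notes."""
    return eighth_note_bpm / 3

print(round(felt_bpm_in_6_8(202), 1))  # 67.3
```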


BPM reduces songs to something that only makes sense in a limited context, like beatmixing (ex-DJ here) - are there other situations where BPM is helpful?


The stated purpose from the site is to be able to match songs on a playlist, so like keep a consistent bpm across an entire playlist. If that was your goal, you'd presumably want the dotted-quarter bpm for 6/8 instead of the eighth. For a listener, the feel of the song is going to be based on when the kicks and snares come, not how things are divided up within those beats. That's what I was trying to get at, but man it's hard to put a description of time signature into plain text.


> but 3/4 is exceedingly rare in popular modern music.

So true, which is a shame, since 3/4 songs such as "Suspended in Gaffa" by Kate Bush or "All in All (This One Last Wild Waltz)" by Dexy's Midnight Runners are great to listen to and stick with you. Not to mention "Golden Brown" by The Stranglers - although that one is apparently in 12/8 and 13/8 for the riff [0].

My favourite modern waltz would have to be "Paristocrats" by Gonzales [1].

[0] https://www.goldradiouk.com/news/music/stranglers-golden-bro...

[1] https://www.youtube.com/watch?v=cNVzS3p8KKU


The Audio Analysis API also reports a tempo_confidence and there’s a 43.2% confidence on that Adele song.

    {
      "meta": {
        "analyzer_version": "4.0.0",
        "platform": "Linux",
        "detailed_status": "OK",
        "status_code": 0,
        "timestamp": 1633650625,
        "analysis_time": 7.80526,
        "input_process": "libvorbisfile L+R 44100->22050"
      },
      "track": {
        "num_samples": 4954520,
        "duration": 224.69478,
        "sample_md5": "",
        "offset_seconds": 0,
        "window_seconds": 0,
        "analysis_sample_rate": 22050,
        "analysis_channels": 1,
        "end_of_fade_in": 0.26807,
        "start_of_fade_out": 209.58913,
        "loudness": -7.519,
        "tempo": 141.981,
        "tempo_confidence": 0.432,
        "time_signature": 4,
        "time_signature_confidence": 1,
        "key": 5,
        "key_confidence": 0.545,
        "mode": 1,
        "mode_confidence": 0.58,

        ...

      }
    }
[1] https://developer.spotify.com/console/get-audio-analysis-tra...
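For anyone decoding those fields: per the Spotify docs, `key` is a pitch class (0 = C ... 11 = B, with -1 meaning no key detected) and `mode` is 1 for major, 0 for minor, so the payload above reads as F major:

```python
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def key_name(key: int, mode: int) -> str:
    """Turn Spotify's integer key/mode pair into a readable key name.
    key: pitch class 0-11 (-1 = no key detected); mode: 1=major, 0=minor."""
    if key == -1:
        return "unknown"
    return f"{PITCH_CLASSES[key]} {'major' if mode == 1 else 'minor'}"

print(key_name(5, 1))  # F major
```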


This is actually still a difficult task and still under active research. The name of the phenomenon is "tempo octave error". Typically an algorithm looks for evenly-spaced strong pulses of energy, and infers the BPM from that. If there is a strong beat at multiple of the actual BPM (half, double, 4x, etc) then it could be mistakenly identified as the BPM. As alin23 points out in a sibling comment it seems like the Spotify algorithm at least has a confidence level here. There is some more information about BPM computation and octave errors at https://www.audiolabs-erlangen.de/resources/MIR/FMP/C6/C6S2_...
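One common mitigation, sketched here as my own guess rather than anything Spotify actually does, is to fold a detected tempo into a plausible range by halving or doubling, which maps the ~142 BPM estimate for a slow ballad down to ~71:

```python
def fold_tempo(bpm: float, lo: float = 60.0, hi: float = 120.0) -> float:
    """Fold a tempo estimate into [lo, hi) by octave steps (halve/double).
    This can't fix a genuinely wrong estimate, but it normalizes octave
    errors like reporting 142 BPM for a ~71 BPM ballad."""
    while bpm >= hi:
        bpm /= 2
    while bpm < lo:
        bpm *= 2
    return bpm

print(round(fold_tempo(141.981), 1))  # 71.0
```

The catch is that the "plausible range" is genre-dependent, which is part of why octave errors are still an open problem.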


To clarify, I'm coming at BPM strictly from a perspective of segueing from one danceable tune to another. If you go directly from a 140 BPM song to "Easy On Me", you are guaranteed to clear the dance floor, except for maybe a few slow-dancing couples.


I've been using the aptly named https://songbpm.com for about 10 years, which seems to have heavily inspired this site.


How well does this manage songs with key changes?


The data comes from Spotify, so I guess we should ask them :)

But in my tests, the pitches seem to be averaged so only the dominant key is reported.


super cool. would love a "has lyrics" feature (for my player, https://avant.fm). Always wonder how hard that would be.


Do you mean whether the song has a vocalist singing? At the Music Technology Group (https://www.upf.edu/web/mtg) we have some classifiers that do something similar. There are demos at https://replicate.com/mtg/music-classifiers; see the output of the "voice_instrumental" classifier. It is built from a dataset of instrumental songs and songs with singing in them; such classifiers are normally trained on the timbre of the song. Earlier versions of this classifier are also available at AcousticBrainz: https://acousticbrainz.org/


thanks for pointing me to that. avant is powered by musicbrainz actually - so I will look to see what I can pull from acousticbrainz!


I love avant.fm! Do you know about Musixmatch? They have an API for time-synced lyrics and are the source of that data for a bunch of large services.


thank you!! that's very nice to hear. and I will check out musixmatch for sure.



