References
- AAB+15
Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dandelion Mané, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. TensorFlow: large-scale machine learning on heterogeneous systems. 2015. Software available from tensorflow.org. URL: https://www.tensorflow.org/.
- ATHJ21
Manuel Anglada-Tort, Peter MC Harrison, and Nori Jacoby. Repp: a robust cross-platform solution for online sensorimotor synchronization experiments. bioRxiv, 2021.
- BKK18
Shaojie Bai, J. Zico Kolter, and Vladlen Koltun. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. CoRR, 2018. URL: http://arxiv.org/abs/1803.01271, arXiv:1803.01271.
- Bot91
Léon Bottou. Stochastic gradient learning in neural networks. Proceedings of Neuro-Nîmes, 91(8):12, 1991.
- BockKW14a
S. Böck, F. Krebs, and G. Widmer. A Multi-model Approach to Beat Tracking Considering Heterogeneous Music Styles. In 15th Conf. of the Int. Soc. for Music Information Retrieval (ISMIR 2014), 603–608. Taipei, Taiwan, October 2014.
- BockD20
Sebastian Böck and Matthew EP Davies. Deconstruct, analyse, reconstruct: how to improve tempo, beat, and downbeat estimation. Proc. of ISMIR (International Society for Music Information Retrieval). Montreal, Canada, pages 574–582, 2020.
- BockDK19
Sebastian Böck, Matthew EP Davies, and Peter Knees. Multi-task learning of tempo and beat: learning one to improve the other. In ISMIR, 486–493. 2019.
- BockKW14b
Sebastian Böck, Florian Krebs, and Gerhard Widmer. A multi-model approach to beat tracking considering heterogeneous music styles. In Proc. of the 15th Intl. Society for Music Information Retrieval Conf. (ISMIR), 603–608. Taipei, Taiwan, 2014.
- BockS11
Sebastian Böck and Markus Schedl. Enhanced beat tracking with context-aware neural networks. In Proc. Int. Conf. Digital Audio Effects, 135–139. 2011.
- BockKW16
S. Böck, F. Krebs, and G. Widmer. Joint beat and downbeat tracking with recurrent neural networks. In 17th International Society for Music Information Retrieval Conference (ISMIR). 2016.
- CFG18
Tian Cheng, Satoru Fukayama, and Masataka Goto. Convolving gaussian kernels for rnn-based beat tracking. In 2018 26th European Signal Processing Conference (EUSIPCO), 1905–1909. IEEE, 2018.
- CVMerrienboerG+14
K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio. Learning phrase representations using rnn encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078, 2014.
- CFCS17
Keunwoo Choi, György Fazekas, Kyunghyun Cho, and Mark Sandler. A tutorial on deep learning for music information retrieval. arXiv preprint arXiv:1709.04396, 2017.
- C+15
- DDP09
Matthew EP Davies, Norberto Degara, and Mark D Plumbley. Evaluation methods for musical audio beat tracking algorithms. Queen Mary University of London, Centre for Digital Music, Tech. Rep. C4DM-TR-09-06, 2009.
- DRuaP+12
Norberto Degara, Enrique Argones Rúa, Antonio Pena, Soledad Torres-Guijarro, Matthew EP Davies, and Mark D Plumbley. Reliability-informed beat tracking of musical signals. IEEE Transactions on Audio, Speech, and Language Processing, 20(1):290–301, 2012.
- DSdHMuller19
Jonathan Driedger, Hendrik Schreiber, W. Bas de Haas, and Meinard Müller. Towards automatically correcting tapped beat annotations for music recordings. In Proceedings of the 20th International Society for Music Information Retrieval Conference (ISMIR). Delft, The Netherlands, November 2019.
- DBDR15
S. Durand, J. P. Bello, B. David, and G. Richard. Downbeat tracking with multiple features and deep neural networks. In 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2015. doi:10.1109/ICASSP.2015.7178001.
- DBDR16
S. Durand, J. P. Bello, B. David, and G. Richard. Feature adapted convolutional neural networks for downbeat tracking. In IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP). 2016.
- DBDR17
S. Durand, J. P. Bello, B. David, and G. Richard. Robust Downbeat Tracking Using an Ensemble of Convolutional Networks. IEEE/ACM Trans. on Audio, Speech, and Language Processing, 25(1):76–89, January 2017.
- DE16
S. Durand and S. Essid. Downbeat Detection With Conditional Random Fields And Deep Learned Features. In 17th Int. Soc. for Music Information Retrieval Conf. (ISMIR 2016), 386–392. New York, USA, August 2016.
- DDR14
Simon Durand, Bertrand David, and Gaël Richard. Enhancing downbeat detection when facing different music styles. In 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 3132–3136. IEEE, 2014.
- Ell07
Daniel PW Ellis. Beat tracking by dynamic programming. Journal of New Music Research, 36(1):51–60, 2007.
- FJDE15
T. Fillon, C. Joder, S. Durand, and S. Essid. A conditional random field system for beat tracking. In IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), 424–428. South Brisbane, Australia, April 2015.
- FMR+19
M. Fuentes, L. S. Maia, M. Rocamora, L. W. P. Biscainho, H. C. Crayencour, S. Essid, and J. P. Bello. Tracking beats and microtiming in afro-latin american music using conditional random fields and deep learning. In 20th International Society for Music Information Retrieval Conference, ISMIR. 2019.
- FMC+19
M. Fuentes, B. McFee, H.C. Crayencour, S. Essid, and J.P. Bello. A music structure informed downbeat tracking system using skip-chain conditional random fields and deep learning. In 44th Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 481–485. Brighton, UK, May 2019.
- FMC+18
Magdalena Fuentes, Brian McFee, Hélène Crayencour, Slim Essid, and Juan Bello. Analysis of common design choices in deep learning systems for downbeat tracking. In The 19th International Society for Music Information Retrieval Conference. 2018.
- GBC16
Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep learning. MIT press, 2016.
- Got01
Masataka Goto. An audio-based real-time beat tracking system for music with or without drum-sounds. Journal of New Music Research, 30(2):159–171, 2001.
- GTH19
Alexander Greaves-Tunnell and Zaid Harchaoui. A statistical investigation of long memory in language and music. arXiv preprint arXiv:1904.03834, 2019.
- GSKoutnik+16
Klaus Greff, Rupesh K Srivastava, Jan Koutník, Bas R Steunebrink, and Jürgen Schmidhuber. Lstm: a search space odyssey. IEEE transactions on neural networks and learning systems, 28(10):2222–2232, 2016.
- HM04
Stephen W Hainsworth and Malcolm D Macleod. Particle filtering applied to musical tempo tracking. EURASIP Journal on Advances in Signal Processing, 2004(15):927847, 2004.
- HZCPerpinan04
Xuming He, Richard S Zemel, and Miguel Á Carreira-Perpiñán. Multiscale conditional random fields for image labeling. In Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004., volume 2, II–II. IEEE, 2004.
- HS97
Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural computation, 9(8):1735–1780, 1997.
- HDF12
Jason Hockman, Matthew EP Davies, and Ichiro Fujinaga. One in the jungle: downbeat detection in hardcore, jungle, and drum and bass. In ISMIR, 169–174. 2012.
- HKS14
A. Holzapfel, F. Krebs, and A. Srinivasamurthy. Tracking the 'odd': meter inference in a culturally diverse music corpus. In 15th Int. Society for Music Information Retrieval Conf. (ISMIR), 425–430. Taipei, Taiwan, October 2014.
- HG16
Andre Holzapfel and Thomas Grill. Bayesian meter tracking on learned signal representations. In ISMIR-International Conference on Music Information Retrieval, 262–268. ISMIR, 2016.
- HDZ+12
André Holzapfel, Matthew E. P. Davies, José R. Zapata, João Lobato Oliveira, and Fabien Gouyon. Selective sampling for beat tracking evaluation. IEEE Transactions on Audio, Speech, and Language Processing, 20(9):2539–2548, 2012. doi:10.1109/TASL.2012.2205244.
- IS15
S. Ioffe and C. Szegedy. Batch normalization: accelerating deep network training by reducing internal covariate shift. In 32nd International Conference on Machine Learning (ICML). 2015.
- JLL19
Bijue Jia, Jiancheng Lv, and Dayiheng Liu. Deep learning-based automatic downbeat tracking: a brief review. Multimedia Systems, pages 1–22, 2019.
- JZS15
Rafal Jozefowicz, Wojciech Zaremba, and Ilya Sutskever. An empirical exploration of recurrent network architectures. In International Conference on Machine Learning, 2342–2350. 2015.
- KFRO12
Maksim Khadkevich, Thomas Fillon, Gaël Richard, and Maurizio Omologo. A probabilistic approach to simultaneous extraction of beats and downbeats. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 445–448. IEEE, 2012.
- KB14
D. Kingma and J. Ba. Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
- KEA06
A.P. Klapuri, A.J. Eronen, and J.T. Astola. Analysis of the meter of acoustic musical signals. IEEE Transactions on Audio, Speech, and Language Processing, 14(1):342–355, 2006. doi:10.1109/TSA.2005.854090.
- KFH+15
Peter Knees, Angel Faraldo, Perfecto Herrera, Richard Vogl, Sebastian Böck, Florian Hörschläger, and Mickael Le Goff. Two data sets for tempo estimation and key detection in electronic dance music annotated from user corrections. In Proc. of the 16th Intl. Society for Music Information Retrieval Conf. (ISMIR), 364–370. 2015.
- KBockW14
Filip Korzeniowski, Sebastian Böck, and Gerhard Widmer. Probabilistic extraction of beat positions from a beat activation function. In ISMIR, 513–518. 2014.
- KBockW11
F. Krebs, S. Böck, and G. Widmer. An efficient state space model for joint tempo and meter tracking. In 16th International Society for Music Information Retrieval Conference (ISMIR). 2015.
- KBockDW16
F. Krebs, S. Böck, M. Dorfer, and G. Widmer. Downbeat tracking using beat synchronous features with recurrent neural networks. In 17th International Society for Music Information Retrieval Conference (ISMIR). 2016.
- KBockW13
Florian Krebs, Sebastian Böck, and Gerhard Widmer. Rhythmic pattern modeling for beat and downbeat tracking in musical audio. In ISMIR, 227–232. 2013.
- KHCW15
Florian Krebs, Andre Holzapfel, Ali Taylan Cemgil, and Gerhard Widmer. Inferring metrical structure in music using particle filters. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 23(5):817–827, 2015.
- KH04
Sanjiv Kumar and Martial Hebert. Discriminative fields for modeling spatial dependencies in natural images. In Advances in neural information processing systems, 1531–1538. 2004.
- Lon04
J. London. Hearing in Time: Psychological Aspects of Musical Meter. Oxford University Press, New York, USA, 2004.
- MBock19
Matthew EP Davies and Sebastian Böck. Temporal convolutional networks for musical audio beat tracking. In 2019 27th European Signal Processing Conference (EUSIPCO), 1–5. IEEE, 2019.
- McF18
Brian McFee. Statistical Methods for Scene and Event Classification, pages 103–146. Springer International Publishing, Cham, 2018. URL: https://doi.org/10.1007/978-3-319-63450-0_5, doi:10.1007/978-3-319-63450-0_5.
- MB17
Brian McFee and Juan P. Bello. Structured training for large-vocabulary chord recognition. In 18th International Society for Music Information Retrieval Conference, ISMIR. 2017.
- MR02
missing journal in murphy2002dynamic
- NH10
Vinod Nair and Geoffrey E Hinton. Rectified linear units improve restricted boltzmann machines. In Proceedings of the 27th international conference on machine learning (ICML-10), 807–814. 2010.
- NRJB15
L. Nunes, M. Rocamora, L. Jure, and L. W. P. Biscainho. Beat and downbeat tracking based on rhythmic patterns applied to the uruguayan candombe drumming. In 16th Int. Soc. for Music Information Retrieval Conf. (ISMIR), 264–270. Málaga, Spain, October 2015.
- PT16
Helene Papadopoulos and George Tzanetakis. Models for music analysis from a markov logic networks perspective. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 25(1):19–34, 2016.
- PP10a
Hélène Papadopoulos and Geoffroy Peeters. Joint estimation of chords and downbeats from an audio signal. IEEE Transactions on Audio, Speech, and Language Processing, 19(1):138–152, 2010.
- PHV16
Giambattista Parascandolo, Heikki Huttunen, and Tuomas Virtanen. Recurrent neural networks for polyphonic sound event detection in real life recordings. In 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 6440–6444. IEEE, 2016.
- PP10b
Geoffroy Peeters and Helene Papadopoulos. Simultaneous beat and downbeat-tracking using a probabilistic framework: theory and large-scale evaluation. IEEE Transactions on Audio, Speech, and Language Processing, 19(6):1754–1769, 2010.
- PBockCD21
António S Pinto, Sebastian Böck, Jaime S Cardoso, and Matthew EP Davies. User-driven fine-tuning for beat tracking. Electronics, 10(13):1518, 2021.
- PLV+19
Hendrik Purwins, Bo Li, Tuomas Virtanen, Jan Schlüter, Shuo-Yiin Chang, and Tara Sainath. Deep learning for audio signal processing. IEEE Journal of Selected Topics in Signal Processing, 13(2):206–219, 2019.
- Rab89
Lawrence R Rabiner. A tutorial on hidden markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2):257–286, 1989.
- Sch98
Eric D Scheirer. Tempo and beat analysis of acoustic musical signals. The Journal of the Acoustical Society of America, 103(1):588–601, 1998.
- SchluterBock14
Jan Schlüter and Sebastian Böck. Improved musical onset detection with convolutional neural networks. In 2014 ieee international conference on acoustics, speech and signal processing (icassp), 6979–6983. IEEE, 2014.
- SUM20
H. Schreiber, J. Urbano, and M. Müller. Music tempo estimation: are we done yet? Transactions of the International Society for Music Information Retrieval, 3(1):111–123, 2020. doi:10.5334/tismir.43.
- SP97
Mike Schuster and Kuldip K Paliwal. Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing, 45(11):2673–2681, 1997.
- Set05
Burr Settles. Abner: an open source tool for automatically tagging genes, proteins and other entity names in text. Bioinformatics, 21(14):3191–3192, 2005.
- SP03
Fei Sha and Fernando Pereira. Shallow parsing with conditional random fields. In Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology-Volume 1, 134–141. Association for Computational Linguistics, 2003.
- SBD16
Siddharth Sigtia, Emmanouil Benetos, and Simon Dixon. An end-to-end neural network for polyphonic piano music transcription. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 24(5):927–939, 2016.
- SjobergL95
Jonas Sjöberg and Lennart Ljung. Overtraining, regularization and searching for a minimum, with application to neural networks. International Journal of Control, 62(6):1391–1407, 1995.
- SHCS15
A. Srinivasamurthy, A. Holzapfel, A. T. Cemgil, and X. Serra. Particle filters for efficient meter tracking with dynamic bayesian networks. In 16th Int. Society for Music Information Retrieval Conf. (ISMIR). 2015. URL: http://hdl.handle.net/10230/34998.
- SHCS16
A. Srinivasamurthy, A. Holzapfel, A. T. Cemgil, and X. Serra. A generalized bayesian model for tracking long metrical cycles in acoustic music signals. In IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), 76–80. Shanghai, China, March 2016.
- SHS17
Ajay Srinivasamurthy, Andre Holzapfel, and Xavier Serra. Informed automatic meter analysis of music recordings. In ISMIR-International Conference on Music Information Retrieval. 2017.
- SHS14
Ajay Srinivasamurthy, André Holzapfel, and Xavier Serra. In search of automatic rhythm analysis methods for turkish and indian art music. Journal of New Music Research, 43(1):94–114, 2014.
- SR21
Christian J Steinmetz and Joshua D Reiss. Wavebeat: end-to-end beat and downbeat tracking in the time domain. arXiv preprint arXiv:2110.01436, 2021.
- SM06
C. Sutton and A. McCallum. An introduction to conditional random fields for relational learning. In Lise Getoor and Ben Taskar, editors, Introduction to Statistical Relational Learning, chapter 4, pages 93–128. MIT Press, Cambridge, USA, 2006.
- SM12
Charles Sutton and Andrew McCallum. An introduction to conditional random fields. Foundations and Trends® in Machine Learning, 4(4):267–373, 2012.
- USchluterG14
Karen Ullrich, Jan Schlüter, and Thomas Grill. Boundary detection in music structure analysis using convolutional neural networks. In ISMIR, 417–422. 2014.
- VDWK17
Richard Vogl, Matthias Dorfer, Gerhard Widmer, and Peter Knees. Drum transcription via joint beat and drum modeling using convolutional recurrent neural networks. In ISMIR, 150–157. 2017.
- W+90
Paul J Werbos and others. Backpropagation through time: what it does and how to do it. Proceedings of the IEEE, 78(10):1550–1560, 1990.
- WCG06
N. Whiteley, A. T. Cemgil, and S. J. Godsill. Bayesian modelling of temporal structure in musical audio. In 7th Int. Society for Music Information Retrieval Conf. (ISMIR). Citeseer, 2006.
- ZNY19
missing journal in zahraybeat
- ZDGomez14
José R Zapata, Matthew EP Davies, and Emilia Gómez. Multi-feature beat tracking. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 22(4):816–825, 2014.
- ZZH+17
Chen Zhu, Yanpeng Zhao, Shuaiyi Huang, Kewei Tu, and Yi Ma. Structured attentions for visual question answering. In Proceedings of the IEEE International Conference on Computer Vision, 1291–1300. 2017.
- TheanoDTeam16
Theano Development Team. Theano: A Python framework for fast computation of mathematical expressions. arXiv e-prints, May 2016. URL: http://arxiv.org/abs/1605.02688.