CS 682 Speech Processing

2017-07-19

About the course:

You will master machine learning and signal processing skills.  We will apply them to recognizing speech and speaker identity, but many of the skills you acquire are also useful in other fields such as finance, bioinformatics, and control systems.

Upon successful completion of this class, students should be able to:

  • Understand and implement feature extraction for human speech.
  • Understand human speech production and perception.
  • Understand the importance of language constituencies and how they can be modeled.
  • Implement time-independent and time-dependent machine learning algorithms.
  • Write a scientific paper.
  • Understand readings in the speech technologies literature.

The prerequisites for this course are Computer Science 310, Mathematics 254, and Statistics 551A.  As most CS students will not have taken Statistics 551A, this prerequisite will be waived for any student who is willing to spend some time learning the statistics, the basics of which will be covered in class.

Textbooks:

Since the introduction of deep learning into speech processing by Dahl et al. (2010) and Deng et al. (2010), the field has changed rapidly, and deep learning is now the dominant method used in speech processing applications.  As textbooks on deep learning are just starting to come out, we will use a deep learning book supplemented with readings on speech.

Required:

  • Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep learning (The MIT Press, Cambridge, Massachusetts), xxii, 775 pp.  Available in print at the bookstore or freely available online.

Programming exercises will be implemented using Python 3.5 and TensorFlow.  We will briefly introduce Python in class but will not devote much time to Python programming, as computer science graduate students (and advanced undergraduates) should be able to pick up new languages fairly easily.  Optional books to help you with this are freely available from the SDSU Library through Safari Technical Books:

  • Martelli, A., Ravenscroft, A., and Holden, S. (2017). Python in a Nutshell, 3rd Edition (O’Reilly Media, Sebastopol, CA) or
  • Reitz, K. (2016). The Hitchhiker’s Guide to Python: Best Practices for Development (O’Reilly Media, Sebastopol, CA)

In addition, the python.org tutorial is quite good.