Course language: Like almost all courses offered with the SPAN prefix, this course is taught in Spanish. However, this initial information is in English because one of the goals of the course is to translate it into Spanish.
Overview: Natural Language Processing in Spanish teaches you how to use a computer to do useful things with the Spanish language. By the end of the semester you should have practical knowledge of techniques for solving linguistic problems, such as filtering junk email, automatically discovering the different meanings of a word, automatically translating from one language to another, and identifying the author of a text from the statistics of the words that he or she uses. You will also become familiar with the Python programming language, which is easy to learn and simplifies many natural language processing tasks. The course is good training if you are interested in doing natural language processing work in industry, either in a research lab (Google, Microsoft, Powerset, Yahoo, etc.) or in a startup. As part of the course, we will use Google Translate to translate an English-language textbook on natural language processing into Spanish, with the help of a research group in Barcelona.
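To give a flavor of the word statistics mentioned in the overview, here is a minimal sketch of counting word frequencies in a short Spanish text. It assumes Python 3 and uses only the standard library; the function name `word_frequencies` and the sample sentence are hypothetical illustrations, not taken from the course materials or the textbook.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Return a Counter of lowercased word tokens in `text`."""
    # In Python 3, \w matches Unicode letters by default, so accented
    # Spanish characters such as "ñ" and "é" stay inside their tokens.
    tokens = re.findall(r"\w+", text.lower())
    return Counter(tokens)


# Hypothetical sample sentence, not taken from the course materials.
sample = "El niño come y la niña come; el perro no come."
freqs = word_frequencies(sample)

# Show the two most frequent words with their counts.
print(freqs.most_common(2))  # [('come', 3), ('el', 2)]
```

Frequency profiles like this one are a common starting point for tasks such as authorship identification, where the counts for two texts are compared.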
Objectives:
Outcomes: For you to demonstrate your attainment of these objectives, you will perform the following tasks:

Code of Academic Integrity
“The integrity of Newcomb-Tulane College is based on the absolute honesty of the entire community in all academic endeavors. As part of the Tulane University community, students have certain responsibilities regarding work that forms the basis for the evaluation of their academic achievement. Students are expected to be familiar with these responsibilities at all times. No member of the university community should tolerate any form of academic dishonesty, because the scholarly community of the university depends on the willingness of both instructors and students to uphold the Code of Academic Conduct. When a violation of the Code of Academic Conduct is observed it is the duty of every member of the academic community who has evidence of the violation to take action. Students should take steps to uphold the code by reporting any suspected offense to the instructor or the associate dean of the college. Students should under no circumstances tolerate any form of academic dishonesty.” For further information, point your browser at http://college.tulane.edu/honorcode.htm.
Violations of the Code of Academic Integrity will not be tolerated in this class. I will rigorously investigate and pursue any such transgression.
Students with disabilities who need academic accommodation should:
| Date | Day | Topic | Assignment | ppt | mp3 | P |
| Jan 11 (M) | 1 | Course introduction | NLPP Preface | | | |
| 13 (W) | 2 | Computing with language | NLPP 1.1 | | | |
| 15 (F) | 3 | Computing with language | NLPP 1.1 | | | |
| 18 (M) | MLK Birthday | | | | | |
| 20 (W) | 4 | A closer look at Python, Statistics | NLPP 1.2 - 1.3 | | | |
| 22 (F) | 5 | Statistics, Making decisions and taking control | NLPP 1.3 - 1.4 | | | |
| 25 (M) | 6 | Making decisions and taking control, Automatic natural language understanding, Summary | NLPP 1.4 - 1.6 | | | P1 |
| 27 (W) | 7 | Accessing text corpora | NLPP 2.1 | | | |
| 29 (F) | 8 | Unicode | NLPP 3.3 | | | |
| Feb 1 (M) | 9 | | | -- | -- | P2 |
| 3 (W) | 10 | Unicode 2 | NLPP 3.3 | | | |
| 5 (F) | 11 | Conditional frequency distributions | NLPP 2.2 | | | |
| 8 (M) | 12 | More Python: Reusing code | NLPP 2.3 | -- | | |
| 10 (W) | 13 | Accessing local and web texts | NLPP 3.1 | | | P3 |
| 12 (F) | 14 | Accessing local and web texts | NLPP 3.1 - 3.2 | -- | | |
| 15 (M) | Lundi Gras | | | | | |
| 17 (W) | 15 | More on strings, Regular expressions | NLPP 3.2, 3.4 | | | |
| 19 (F) | 16 | Applications of regular expressions | NLPP 3.5 | | | |
| 22 (M) | 17 | More applications of regular expressions | NLPP 3.6 | | | P4 |
| 24 (W) | 18 | Normalizing & tokenizing text, segmentation, formatting, summary | NLPP 3.7 - 3.10 | | | |
| 26 (F) | 19 | Using a tagger, Tagged corpora | NLPP 5.1 - 5.2 | -- | | |
| Mar 1 (M) | 20 | Associating words with properties using Python dictionaries | NLPP 5.3 | | | P5 |
| 3 (W) | 21 | Automatic tagging | NLPP 5.3 - 5.4 | | | |
| 5 (F) | 22 | Automatic tagging, N-gram tagging, Transformation-based tagging, Word category, Summary | NLPP 5.4 - end | | | |
| 8 (M) | 23 | Supervised classification 1 | NLPP 6.1 | | | P6 |
| 10 (W) | 24 | Supervised classification 2 | NLPP 6.1 | | | |
| 12 (F) | 25 | Supervised classification 3 | NLPP 6.1 | | | |
| 15 (M) | 26 | Supervised classification 4 | NLPP 6.1 | | | |
| 17 (W) | 27 | Supervised classification 5 | NLPP 6.1 | | | P7 |
| 19 (F) | 28 | Supervised classification 6 | NLPP 6.1 | | | |
| 22 (M) | 29 | Supervised classification 7 | NLPP 6.1 | | | P8 |
| 24 (W) | 30 | Evaluation | NLPP 6.3 | | | |
| 26 (F) | 31 | Information extraction | NLPP 7.1 | -- | | |
| 29 (M) | Spring Break | | | | | |
| 31 (W) | Spring Break | | | | | |
| Apr 2 (F) | Spring Break | | | | | |
| 5 (M) | Spring Break | | | | | |
| 7 (W) | 32 | Chunking, Chunkers, Recursion, Names, Relations, Summary, Grammatical dilemmas, Syntax, CFG, Parsing, Dependencies, Grammar development, Summary, Grammatical features, Processing features, Extending the grammar, Summary | NLPP 7.2 - 9.4 | | | |
| 9 (F) | 33 | NL understanding, Propositional logic | NLPP 10.1 - 10.2 | | | |
| 12 (M) | 34 | First-order logic (FOL) | NLPP 10.3 | | | P9 |
| 14 (W) | 35 | First-order logic (FOL), Semantics of sentences | NLPP 10.3 - 10.4 | | | |
| 16 (F) | 36 | Semantics of sentences, Discourse semantics, Summary | NLPP 10.4 - 10.6 | | | |
| 19 (M) | 37 | Corpus structure, Life cycle | NLPP 11.1 - 11.2 | | | P10 |
| 21 (W) | 38 | Acquiring data, XML | NLPP 11.3 - 11.4 | | | |
| 23 (F) | 39 | Toolbox data, OLAC, Summary | NLPP 11.5 - 11.7 | | | |
| 26 (M) | 40 | The language challenge | Afterword | | | P11 |
| May 6 (R) | -- | FINAL EXAM DAY, 8 - noon | Present projects to class | | | |