computational-linguistics's Introduction

Computational Linguistics

As the information age has revolutionized the way humans live, we are experiencing information overload, where the amount of data we encounter grows beyond human capacity. Although multimedia has become a substantial part of big data, text is still the most primitive yet dominant medium. This interdisciplinary course addresses how to extract the needed information from text (Linguistics), perform statistical analysis of the extracted information (Quantitative Theory and Methods), and write computer programs to automate this process (Computer Science).

Prerequisites: CS 171 (CS and LING students) or QTM 220 (QTM students)

computational-linguistics's People

Contributors

jdchoi77, sfillwo

computational-linguistics's Issues

[QZ1] Spaces and dictionaries

For Quiz 1, can we assume that there is only one space between each word? So this case will not happen: "I have one hundred apples."
Also, should we create our own dictionaries?

Euclidean Distance

For the 02/03 lecture, I'm not sure why we need the second line in the euclidean function. Thank you.

def euclidean(x1: Dict[str, float], x2: Dict[str, float]) -> float:
    t = sum(((s1 - x2.get(term, 0)) ** 2 for term, s1 in x1.items()))
    t += sum((s2 ** 2 for term, s2 in x2.items() if term not in x1))
    return t
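
One way to see what the second line contributes is to compare the result with and without it. The sketch below is only an illustration, not the lecture's code: the helper names `first_sum_only` and `both_sums` and the toy vectors are assumptions. The first sum iterates over the terms of `x1`, so a term that occurs only in `x2` is never visited unless the second sum adds it. Note also that the function as posted returns the sum of squared differences; taking `math.sqrt` of that value would give the Euclidean distance itself.

```python
from typing import Dict


def first_sum_only(x1: Dict[str, float], x2: Dict[str, float]) -> float:
    # Visits only the terms of x1; a term that exists only in x2 is never seen.
    return sum((s1 - x2.get(term, 0)) ** 2 for term, s1 in x1.items())


def both_sums(x1: Dict[str, float], x2: Dict[str, float]) -> float:
    # Same logic as the posted function: the second sum adds terms found only in x2.
    t = sum((s1 - x2.get(term, 0)) ** 2 for term, s1 in x1.items())
    t += sum(s2 ** 2 for term, s2 in x2.items() if term not in x1)
    return t


# Toy vectors (hypothetical): the term "b" appears only in x2.
x1 = {"a": 1.0}
x2 = {"a": 1.0, "b": 2.0}

print(first_sum_only(x1, x2))  # 0.0, the difference on "b" is missed
print(both_sums(x1, x2))       # 4.0, the difference on "b" is counted
```

With only the first sum the result would also depend on argument order, whereas with both sums `both_sums(x1, x2)` and `both_sums(x2, x1)` agree.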

[QZ#2] Question about the test case "'boy.n.01' and 'girl.n.01'"

Hello everyone!

I was playing with my paths method using different test cases, and I somehow found that the hypernym_paths() method call of "boy.n.01" returned two paths that both lead to the leaf "male_child.n.01" instead of "boy.n.01". "girl.n.01" works totally fine, but it didn't show 'female_child.n.01' anywhere in the hypernym paths.

Does it mean that I am using the hypernym_paths() method in a wrong way? Will this cause problems when my paths method is put through the test cases?

Please help.

[QZ1] Initial Code Import

For Quiz 1, I am having trouble understanding the first step.
The Quiz 1: String Matching task says: "Create a python file called quiz1.py under the quiz package and copy the code."
What code am I copying?
rk

Lecture Recordings

Where should I find the recording for yesterday's lecture? Thank you.

[QZ4] Test Cases Don't Work

It seems that the overlaps are removed before calling remove_overlap. I've tried "Atlantic City of Georgia" and it returns "Atlantic City" and "Georgia" before or after I call remove_overlap, for "South Korea United States" it returns "South Korea" and "United States", and for "United States of America" it returns "United States"; these results are before calling remove_overlap.

[QZ7] Quiz 7 inaccessible

The link in the Canvas quiz which sends us to GitHub is broken (I see an Octocat 404 page), and the GitHub link which works for quiz 7 sends us to an "Access Denied" page in Canvas (perhaps because the link in Github sending me to Canvas is trying to access an unpublished assignment?)

[Screenshots: GitHub 404 Octocat page; Canvas 403 page]

Meeting passcode for office hours

Hello! I am trying to join the office hour but zoom is asking me to type in a passcode. Does anyone know what the passcode is? Thank you!!

Download PoS Tagging Data

Both of the files "wsj-pos.dev.gold.tsv" and "wsj-pos.trn.gold.tsv" are 404: Not Found. I've clicked on the links on the GitHub page that are supposed to direct to those data, but it still shows 404 Not Found. Should there be another link?

[QZ0] PyCharm License

For PyCharm, I am prompted with a screen that tells me to add a license while using my Emory email. Currently, it says I have no licenses available and I was wondering how I acquire them?

[QZ4] Remove Entities Conflict

There are two requirements to be met when we implement the remove_overlaps function: remove the minimum number of entities, and remove the shorter entities when they overlap with longer ones. However, what should our program remove when the two conditions give us different instructions?

For example, given the span a b, b c d, and d e, if we remove the shorter entities and keep the longer ones, we should remove a b and d e; however, if we remove the minimum number of entities, we should remove b c d instead of a b and d e. Which condition should we go with when we encounter such cases?
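
The conflict in this example can be reproduced with a small sketch. Everything here is an assumption made for illustration: spans are encoded as hypothetical `(start, end)` token offsets with an exclusive end, and `keep_longest_first` and `keep_max_count` are made-up names for the two readings of the requirement, not the quiz's expected implementation.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def overlaps(a: Span, b: Span) -> bool:
    # Two half-open intervals overlap when each starts before the other ends.
    return a[0] < b[1] and b[0] < a[1]


def keep_longest_first(spans: List[Span]) -> List[Span]:
    # Reading 1: prefer longer entities, dropping any span that overlaps a kept one.
    kept: List[Span] = []
    for span in sorted(spans, key=lambda s: s[1] - s[0], reverse=True):
        if not any(overlaps(span, k) for k in kept):
            kept.append(span)
    return sorted(kept)


def keep_max_count(spans: List[Span]) -> List[Span]:
    # Reading 2: remove as few spans as possible (greedy by earliest end offset).
    kept: List[Span] = []
    for span in sorted(spans, key=lambda s: s[1]):
        if not kept or span[0] >= kept[-1][1]:
            kept.append(span)
    return kept


# Tokens a b c d e at offsets 0..4: "a b" = (0, 2), "b c d" = (1, 4), "d e" = (3, 5)
spans = [(0, 2), (1, 4), (3, 5)]

print(keep_longest_first(spans))  # [(1, 4)], two spans removed
print(keep_max_count(spans))      # [(0, 2), (3, 5)], one span removed
```

The two strategies disagree on exactly this input, so which one the quiz expects has to come from the assignment's specification.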

[QZ3] Using NLTK for Quiz 3

Does anyone know whether we are allowed to use NLTK for quiz 3? If we are, would we lose points if NLTK is not installed on the grader's computer and the program fails to run? (Considering that we are likely to lose all the points for a quiz if our program does not run.)

[QZ1] normalize_extra

Hello! I am a little confused about what normalize_extra should do. My understanding is that after we put the text through the normalize method, where all a/an are transformed into "1", we then put the output into normalize_extra to transform part of the "1"s back to the original a/an. Is that what we are expected to do? Thank you!!

[QZ1] Word Grammar

The description mentioned that "twenty three hundred" should be translated to 2300, which would formally be written as "two thousand three hundred". Could you give us more examples or explain these kinds of examples?
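
One possible reading of "twenty three hundred" is 23 × 100: the words before "hundred" form a number, which is then multiplied. The sketch below follows that reading and is an assumption, not the quiz's reference solution; `words_to_number` and its word tables are hypothetical names, and it handles only plain cardinals up to the thousands.

```python
UNITS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5, "six": 6,
    "seven": 7, "eight": 8, "nine": 9, "ten": 10, "eleven": 11,
    "twelve": 12, "thirteen": 13, "fourteen": 14, "fifteen": 15,
    "sixteen": 16, "seventeen": 17, "eighteen": 18, "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}


def words_to_number(text: str) -> int:
    total = 0    # value of the groups already closed by "thousand"
    current = 0  # running value of the group being read
    for word in text.lower().split():
        if word in UNITS:
            current += UNITS[word]
        elif word in TENS:
            current += TENS[word]
        elif word == "hundred":
            current *= 100  # "twenty three" (23) becomes 2300
        elif word == "thousand":
            total += current * 1000
            current = 0
        else:
            raise ValueError(f"unsupported word: {word}")
    return total + current


print(words_to_number("twenty three hundred"))        # 2300
print(words_to_number("two thousand three hundred"))  # 2300
```

Under this reading, the informal and the formal phrasing produce the same value.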

[QZ#] Quiz 2 correct values

Hello,
I know we touched on it in class, but I'm not sure how to determine whether the values my similar_documents() function returns are correct, along with my cosine() function. Is there somewhere we can reference what it should be outputting?
Thank you!
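
Without reference outputs, one way to gain some confidence is to check properties that any cosine similarity must satisfy. The sketch below uses a stand-in `cosine()` over sparse dictionary vectors; that implementation and the toy vectors are assumptions, not the course's code, and the same checks could be run against your own function.

```python
import math
from typing import Dict


def cosine(x1: Dict[str, float], x2: Dict[str, float]) -> float:
    # Stand-in implementation (assumption): dot product over the norms.
    dot = sum(s1 * x2.get(term, 0.0) for term, s1 in x1.items())
    n1 = math.sqrt(sum(s ** 2 for s in x1.values()))
    n2 = math.sqrt(sum(s ** 2 for s in x2.values()))
    if n1 == 0 or n2 == 0:
        return 0.0  # convention chosen here for empty or all-zero vectors
    return dot / (n1 * n2)


a = {"cat": 1.0, "dog": 1.0}
b = {"cat": 1.0}
c = {"fish": 3.0}

# A vector compared with itself, or with a scaled copy of itself, scores 1.
assert math.isclose(cosine(a, a), 1.0)
assert math.isclose(cosine(a, {"cat": 2.0, "dog": 2.0}), 1.0)
# Vectors with no shared terms score 0.
assert cosine(a, c) == 0.0
# The measure is symmetric.
assert math.isclose(cosine(a, b), cosine(b, a))
# A value computed by hand: 1 / (sqrt(2) * 1)
assert math.isclose(cosine(a, b), 1 / math.sqrt(2))
```

These checks cannot prove the function is correct, but a failure in any of them points to a bug.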

[QZ0] package or directory

Hello! I know in quiz 0 we need to create a package src/quiz for the quizzes.
Screen Shot 2022-01-13 at 4 47 04 PM

But later it says that src/quiz is a directory.
Screen Shot 2022-01-13 at 4 49 18 PM

I noticed that when I click "add" in PyCharm, I can choose to create a package or a directory. So which one should I create? Thank you!!! :)

quiz0

I am having a problem creating a project (part of quiz0). I am getting a Java.Language.NullPointerException error when trying to create a new package called src/quiz/. Any ideas about how to address this?
rk

[QZ#1] Tokenizers from class?

For quiz 1, if we wish to tokenize the inputted text, are we allowed to use the tokenizer code from class or do we have to write our own?

Office Hours Passcode?

I'm trying to get into the 4pm office hours today, but it says I need a passcode to enter the zoom call.

[QZ0] Pushed .idea and venv by accident

Screenshot (36)

The first time I committed and pushed, I think my .gitignore didn't work right because everything, including the .idea and venv folders, got pushed to my repo. I tried it again but I can't figure out how to get rid of them. How do I make my repo have only the files it's supposed to have? Do I have to start over? (I'm fairly new to using github, so sorry if this seems like a silly question.)

Also, the src folder should be under the main cs329 directory, not under .idea, correct?

[Quiz0] Problem installing Jupyter Notebook

Hello,

I have attached screenshots of problems I am facing in Pycharm when doing the Quiz 0 Getting Started tasks. The 1st error appears when I enter the "pip install jupyter" command. In the 2nd screenshot, the "OK" button at the bottom right is greyed out when I try to select the base interpreter. What suggestions do you have for solving these problems?

Thanks.

[Screenshots: terminal error; PyCharm interpreter dialog]

[QZ1] git push

Hello! This is actually a trivial question XD But after I pushed my quiz1 and checked my repository, I found that my quiz0.py and quiz0.ipynb had automatically been pushed again as well. Is there a way to stop this from happening and push only quiz1.py?

Quiz 1 Questions

Are we allowed to complete this assignment with dictionaries that we made or do we have to use regular expressions only?

Office hours password?

Hi,
I tried to join office hours but it asked for a password. I couldn't find a password anywhere on the syllabus. Has anyone else had this issue?
Thanks!
Olivia

[QZ0] push rejected

Hello, when I click push, I get an error message saying the push was rejected because remote changes need to be merged before pushing.

Screen Shot 2022-01-14 at 1 32 29 AM

Here is what I got before the push. Under the push there is another choice called force push. Do I need to click force push instead of push? Thank you!!

Screen Shot 2022-01-14 at 1 33 00 AM
