
moss.py's Introduction

moss.py

A Python client for Moss: A System for Detecting Software Similarity

Introduction

A Python interface for the Moss client. It was written for AutoGrader to handle similarity checking of Python assignment submissions.

It was written using the original bash script/documentation and its PHP dialect.

Installation

pip install mosspy

Usage

import mosspy

userid = 987654321

m = mosspy.Moss(userid, "python")

m.addBaseFile("submission/a01.py")
m.addBaseFile("submission/test_student.py")

# Submission Files
m.addFile("submission/a01-sample.py")
m.addFilesByWildcard("submission/a01-*.py")

# progress function optional, run on every file uploaded
# result is submission URL
url = m.send(lambda file_path, display_name: print('*', end='', flush=True))
print()

print ("Report Url: " + url)

# Save report file
m.saveWebPage(url, "submission/report.html")

# Download whole report locally including code diff links
mosspy.download_report(url, "submission/report/", connections=8, log_level=10, on_read=lambda url: print('*', end='', flush=True)) 
# log_level=10 is logging.DEBUG (use 20, i.e. logging.INFO, to disable debug output)
# on_read function run for every downloaded file

Python Compatibility

Similar Project

License

This project is licensed under the MIT License - see the LICENSE file for details

moss.py's People

Contributors

bruceiv, cbachhuber, eric-vader, johnyka, joseppinilla, jpetrucciani, liziwl, llayman, nareddyt, nedchu, soachishti


moss.py's Issues

Documentation!

Why doesn't the project have a decent documentation? I had to read the source code to figure out the library.

It seems like m.send() doesn't work, what should I do? Thanks a lot!

I downloaded the zip and installed the package successfully
➜ sudo pip install moss.py-master.zip

But when I run moss_usage.py, I got this error:
➜ moss.py-master python moss_usage.py
Report URL:
Traceback (most recent call last):
File "moss_usage.py", line 20, in
m.saveWebPage(url, "submission/report.html")
File "/Users/moss.py-master/mosspy/moss.py", line 143, in saveWebPage
raise Exception("Empty url supplied")
Exception: Empty url supplied

It seems like m.send() doesn't work, what should I do? Thanks a lot!
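
The traceback above shows `m.send()` handing back an empty string, which `saveWebPage` then rejects. A minimal sketch of checking the result before saving, assuming an empty string means the server returned no report URL; `require_report_url` is a hypothetical helper and not part of mosspy:

```python
def require_report_url(url):
    """Return the stripped URL, or raise if no report URL came back."""
    cleaned = (url or "").strip()
    if not cleaned:
        raise ValueError(
            "Moss returned an empty report URL; check the user id, "
            "the language and the submitted files, then try again."
        )
    return cleaned

# Usage with mosspy (not run here, needs network access):
# url = require_report_url(m.send())
# m.saveWebPage(url, "submission/report.html")
```

Failing right after `send()` keeps the error next to its likely cause instead of surfacing later inside `saveWebPage`.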

Download complete report locally

Currently saveWebPage(...) only saves the main report page locally, whereas there are more pages on level 2, the "code diffs". We should also download those locally.

Problem with empty URL being returned

I'm having a problem where "url = m.send(lambda file_path, display_name: print('', end='', flush=True))" returns an empty URL. I discovered that when I remove a file from the collection of files sent to MOSS, all is well. I've attached the file
cell-a8c22df1236cd837@[email protected]
p.s. this file actually ends with ".source" extension, but github refused to attach a file with that suffix!

If I comment out the following line in this file:
df4 = pd.DataFrame(pca.components_[0:2], columns = feature_names, index=['PC1','PC2'])
then everything works correctly. I'm not sure why the inclusion of this file causes mosspy to fail. All works well when submitting directly to moss:
./moss -l python /Users/bdr/antiplag/kct-zips/543s21ASSN1TST2000/*.source
any advice?

I get a broken pipe

When I follow the example I get a broken pipe error which I do not know how to handle from the error messages. Any ideas?

>>> import mosspy
>>> from glob import glob
>>> 
>>> uid = '240793999'
>>> m = mosspy.Moss(uid, "python")
>>> 
>>> files = glob("Hand-in 5/submissions/*/*.py")
>>> basefile = glob("Hand-in 5/code/*.py")
>>> 
>>> [m.addBaseFile(bfile) for bfile in basefile]
>>> m.addFilesByWildcard("Hand-in 5/submissions/*/*.py")

>>> url = m.send(lambda file_path, display_name: print('*', end='', flush=True))
**********Traceback (most recent call last):
  File "submission_unzip.py", line 13, in <module>
    url = m.send(lambda file_path, display_name: print('*', end='', flush=True))
  File "/home/fuzie/.pyenv/versions/appcheck/lib/python3.8/site-packages/mosspy/moss.py", line 131, in send
    self.uploadFile(s, file_path, display_name, index, on_send)
  File "/home/fuzie/.pyenv/versions/appcheck/lib/python3.8/site-packages/mosspy/moss.py", line 104, in uploadFile
    s.send(message.encode())
BrokenPipeError: [Errno 32] Broken pipe

I have verified that both files and basefiles are added to the m.files and m.basefiles

Empty url supplied.

File "testing.py", line 19, in
m.saveWebPage(url, "submission/report.html")
File "C:\Users\anves\Desktop\moss.py-master\moss.py-master\mosspy\moss.py", line 143, in saveWebPage
raise Exception("Empty url supplied")
Exception: Empty url supplied

MOSS not working?

I've set up everything according to the documentation, getting the user id from the automated email, and it shows no results when I use moss_usage.py as is, with the user id replaced with my own.

image

is anyone else experiencing this issue?

Exception on empty file

When there is an empty file in the directory being added, e.g. an empty __init__.py, mosspy crashes with the following error:

 File "similar_code_detector_main.py", line 160, in <module>
    moss_ignorelimit=args.moss_ignorelimit
  File "projects/similar-code-detector/similar_code_detector.py", line 53, in detect_similar_code
    m.addFile(f)
  File "projects/similar-code-detector/venv/lib/python3.7/site-packages/mosspy/moss.py", line 82, in addFile
    raise Exception("addFile({}) => File Not Found".format(file_path))
Exception: addFile(/redacted/path/__init__.py) => File Not Found

Changing the getsize() check in moss.py:79 to >= 0 lets everything run through fine.
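
As an alternative to patching the installed package, empty files can be filtered out before they reach `addFile`. A minimal sketch, assuming that skipping zero-byte files is acceptable for the comparison; `split_by_size` is a hypothetical helper:

```python
import os

def split_by_size(paths):
    """Separate paths into (usable, skipped): usable files exist and are non-empty."""
    usable, skipped = [], []
    for path in paths:
        if os.path.isfile(path) and os.path.getsize(path) > 0:
            usable.append(path)
        else:
            skipped.append(path)
    return usable, skipped

# Usage with mosspy (not run here):
# usable, skipped = split_by_size(candidate_paths)
# for path in usable:
#     m.addFile(path)
```

Returning the skipped paths as well makes it easy to log which files were left out of the report.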

Adding Base Files By Wildcard

This doesn't seem like a supported feature currently, and would be nice to have. I'm just getting started with Moss, so I'm not sure if there is some reason for not supporting this?
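
Until such a method exists, the same effect can be approximated from the calling side. A minimal sketch using the standard `glob` module together with the `addBaseFile` method shown in the usage section; `add_base_files_by_wildcard` is a hypothetical helper name, not a mosspy function:

```python
import glob

def add_base_files_by_wildcard(moss, pattern):
    """Call moss.addBaseFile for every path matching a glob pattern."""
    matched = sorted(glob.glob(pattern))
    for path in matched:
        moss.addBaseFile(path)
    return matched

# Usage with mosspy (not run here):
# add_base_files_by_wildcard(m, "submission/base-*.py")
```

Sorting the matches keeps the upload order stable between runs.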

When bs4 installed without lxml, download failed

bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?

Solution: pip install lxml

It is great to mention this in README
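
A script can also check for the parser up front and print the fix instead of failing during the download. A minimal sketch using only the standard library; `parser_available` is a hypothetical helper:

```python
import importlib.util

def parser_available(module_name="lxml"):
    """Return True if the named parser module can be found by the import system."""
    return importlib.util.find_spec(module_name) is not None

if not parser_available():
    print("lxml is missing; install it with: pip install lxml")
```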

Returning empty url when submitting large number of code files

Number of code files I have submitted: 2066
Moss configuration parameters I have used as below

  1. Ignore limit (-m): 2066
  2. Number of matching files (-n): 2066 * 2066

With the above configuration, I have got an empty report URL. But when I used Ignore limit (-m) as 10 then I got the report URL.
Is there any limit on Ignore limit (-m) parameter?

Problem with cpp submission

Here is my script,


try:
    import mosspy
except:
    print("You are missing a dependency, try running")
    print("pip install mosspy")
    print("You might need to use pip3 and/or sudo depending on your OS")
from pathlib import Path
import sys

# This ID is the default one, I use my own when running
userid = 987654321

languageMap = {"java": [".java"], "cpp": [".cpp", ".h"]}

# for compatibility with Windows and those who don't know about the execution flag
n = 2 if sys.argv[0].startswith("py") else 1

language = sys.argv[n] if len(sys.argv) > n else input("Please enter the language (java or cpp): ")

mossLanguage = "cc" if language == "cpp" else language

m = mosspy.Moss(userid, mossLanguage)

pathlist = []

for extension in languageMap[language.lower()]:
    pathlist += Path(".").glob('**/*' + extension)

for path in pathlist:
    if "__MACOSX" not in str(path):
        m.addFile(str(path))

print(len(pathlist), "files found")
print(pathlist)

print("Sending Report, this might take a while...")
print("Consider Grabbing some Coffee or a Snack.")

url = m.send()

print ("Report Url: " + url)

I also used cpp as the language and get the same problem.
The script gets stuck with: "recv = s.recv(1024)"
Any help would be nice.
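
One way to turn a silent hang at `s.recv(1024)` into a visible error is a socket timeout. A minimal sketch, assuming mosspy creates its socket inside `send()` without setting a timeout of its own, so that the process-wide default from `socket.setdefaulttimeout` applies; `default_socket_timeout` is a hypothetical helper:

```python
import contextlib
import socket

@contextlib.contextmanager
def default_socket_timeout(seconds):
    """Temporarily set the timeout applied to newly created sockets."""
    previous = socket.getdefaulttimeout()
    socket.setdefaulttimeout(seconds)
    try:
        yield
    finally:
        # Restore the earlier default so other code is unaffected.
        socket.setdefaulttimeout(previous)

# Usage with mosspy (not run here, needs network access):
# with default_socket_timeout(120):
#     url = m.send()
```

Under that assumption a stalled server raises a timeout error after the given number of seconds. This does not explain why the server stopped answering; it only makes the stall detectable.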

moss connection errors

Hi,

I am having trouble using MOSS. I am a new MOSS and mosspy user.

When I run the original Perl script, I get the line:
Query submitted. Waiting for the server's response.
and then nothing else happens.

When I try to run my mosspy script on Linux, I get the following error at the line url = m.send():
ConnectionResetError: [Errno 104] Connection reset by peer

When I run the same mosspy script on Windows, I get the following error at the line url = m.send():
TimeoutError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond

In all cases I am using my own userid, which I get from the MOSS perl script that was emailed me.

Any idea what might be going wrong?

Many thanks,

Barry.

Lambda function invalid syntax

Error on:

  File "test.py", line 19
    url = m.send(lambda file_path, display_name: print('*', end='', flush=True))
                                                     ^
SyntaxError: invalid syntax

Not sure what's going on. Using Python 2.7
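
In Python 2.7 `print` is a statement, so it cannot appear inside a `lambda` unless the file starts with `from __future__ import print_function`. A minimal sketch of a named progress callback that avoids `print` entirely and takes the same two arguments as the lambda in the usage example; `show_progress` is a hypothetical name:

```python
import sys

def show_progress(file_path, display_name):
    """Write one asterisk per uploaded file without using print."""
    sys.stdout.write("*")
    sys.stdout.flush()

# Usage with mosspy (not run here, needs network access):
# url = m.send(show_progress)
```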

MOSS connection error

./moss.pl -l python mst.py
Checking files . . .
OK
Could not connect to server moss.stanford.edu: Connection refused

I don't know if I am doing something wrong, but this is the first time this is happening to me.

Connection reset by peer

I keep getting this error when I try to run the script. The input size is large, could this be the problem? I don't think it's a connectivity issue because it works on smaller input size.
Thanks!

Traceback (most recent call last):
File "moss_script.py", line 370, in
main()
File "moss_script.py", line 203, in main
run_mosspy(args, args.language, base_files, student_files)
File "moss_script.py", line 269, in run_mosspy
url = moss.send()
File "/homes/z/zhan1486/.local/lib/python2.7/site-packages/mosspy/moss.py", line 129, in send
self.uploadFile(s, file_path, display_name, index)
File "/homes/z/zhan1486/.local/lib/python2.7/site-packages/mosspy/moss.py", line 103, in uploadFile
s.send(message.encode())
socket.error: [Errno 104] Connection reset by peer

Explain: Moss also excludes all code that appears in too many of the submitted programs.

Moss also excludes all code that appears in too many of the submitted programs. Thus, all matches reported by Moss fairly accurately approximate the signature of plagiarized code: a passage of similar code in two programs that does not also appear in very many other programs.

What are the criteria for an above-mentioned statement?
I ran it with 41 files, 6 of which are similar, but those 6 files do not appear in the results.
When I ran it with only those 6 files, I got results.

Received empty url after moss ran for 10 minutes

As the title says, when I sent about 150 cpp in a submission,
the error prompts after doing nothing for about 10 minutes.
I'm pretty sure that my userid is correct, and I can reach moss.stanford.edu:7690
(I tested with nc -v -u -z -w 30 moss.stanford.edu 7690 : Connection to moss.stanford.edu 7690 port [udp/*] succeeded!)

I've tried quite a few times yesterday and today and it's still not working.
Any idea why this is happening?

0 size file causing issues

For my use case, there is a 0 sized file as part of the submission which causes an exception from the following recent change:
and os.path.getsize(file_path) > 0

I'm wondering what the purpose of this additional condition is and whether it's necessary?

Connection refused error

Hi I just started using moss again using a new username.
My user id is
$userid=917327608;
I am running the moss as such
./moss.pl -l java -d SER_216_Assignment2_Submissions//.java
and I have the following error message
Checking files . . .
OK
Could not connect to server moss.stanford.edu: Connection refused
Can u pls help out?

Ayca

addBaseFile() doesn't seem to work

Adding the base file submission/a01.py doesn't seem to work. The code is not ignored by Moss.

def code_holder(num1, num2):
    v = 0
    #### YOUR CODE GOES HERE ####
    #### make sure you save the return value in the variable: v
    



    #### DO NOT WRITE AFTER THIS LINE

    return v

gggg

Am I missing something here ?

Moss not working

mosspy is stuck on m.send (where m is the mosspy.Moss() object)

100 submissions/day limit

UPDATES

Nov 13, 2022, Due to persistent use of robots to submit tens of thousands of Moss jobs per day, the submission limit of 100 submissions/day/user is now being enforced.

Does anybody know if "100 submissions" refer to 100 files or 100 "runs" of the script?

I know that my script is correct, because I already successfully processed a couple of our tasks, but now I keep getting all kind of errors:

  • [Errno -3] Temporary failure in name resolution
  • [Errno -5] No address associated with hostname
  • [Errno 101] Network is unreachable
  • [Errno 104] Connection reset by peer
  • [Errno 111] Connection refused
  • sometimes the script just hangs for hours, and I am forced to kill it

So now I am wondering if I am being throttled/banned. I am 99.9% sure, that I didn't make more than 100 individual submissions, because my script sleeps for 30 minutes between each retry, and so it shouldn't have exceeded 48 submissions per day (assuming that "submission" means a single batch of files and not each individual file).

I seriously don't understand, why is the Moss server so flaky. The official explanation is that there is a lot of load on the servers, which is understandable, but if 90% of submission attempts end in failures due to the flaky server, then most of that load comes from retrying the same submission over and over again instead of actually processing new submissions.
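
The retry loop described above can be bounded so that a flaky server does not consume the daily quota. A minimal sketch, assuming each call to `send()` counts as one submission (the issue leaves this open) and that the listed errors surface as `OSError` subclasses, as they do in Python 3; `call_with_retries` and its parameters are hypothetical:

```python
import time

def call_with_retries(action, attempts=3, first_delay=60.0, sleep=time.sleep):
    """Run action(); on a network-type error wait, then retry with a doubled delay."""
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except OSError:
            # Give up after the last allowed attempt and let the error propagate.
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= 2

# Usage with mosspy (not run here, needs network access):
# url = call_with_retries(m.send, attempts=3, first_delay=1800.0)
```

Capping the number of attempts puts an upper bound on how many submissions a single script run can make.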

Doesn't Work!

Crashes with the log below.

  File "main.py", line 163, in <module>
    main()
  File "main.py", line 159, in main
    sendToMoss()
  File "main.py", line 141, in sendToMoss
    url = moss.send()
  File "/Library/Python/2.7/site-packages/mosspy/moss.py", line 109, in send
    s.connect((self.server, self.port))
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 228, in meth
    return getattr(self._sock,name)(*args)
socket.error: [Errno 61] Connection refused

Not working with cpp files

import mosspy
import config

user_id = 867764230
m = mosspy.Moss(user_id, "cc")

working_dir = config.source_codes_dir

m.addBaseFile(working_dir + "a_plus_b.cpp")
#m.addBaseFile("submission/test_student.py")

# Submission Files

m.addFile(working_dir + "a_plus_b_1.cpp")
m.addFile(working_dir + "a_plus_b.cpp")

url = m.send() # Submission Report URL

print ("Report Url: " + url)

# Save report file

#m.saveWebPage(url, "submission/report.html")

# Download whole report locally including code diff links

#mosspy.download_report(url, "submission/report/", connections=8)

moss didn't find any matching lines, but these two codes are similar

Eventual ConnectionResetError on file upload

I've been using MossPy for a while and I love it. It has worked consistently for me in the past. But recently, it has started crashing when I try to compare more than a few files. I've reproduced this 10 times now, and it no longer ever finishes my program without crashing.

Code:

m = mosspy.Moss(userid, "python") #Removed userid for privacy

m.addBaseFile("desktop.ini") #Don't have an actual base file, so use this... it's worked in the past.

#Note: this is about 30 files, the largest of which is a few hundred lines of code.
m.addFilesByWildcard("Critters - Milestone 2/*/Lion.java") 

url = m.send(lambda file_path, display_name: print('*', end='', flush=True))

print ("Report Url: " + url)

It prints 15-20 asterisks before it crashes, meaning it's: (1) getting through over half the files before crashing, and (2) getting through a different number of files each time, implying it's not crashing because of a particular file.

Stacktrace:

File "run_moss.py", line 32, in <module>
    url = m.send(lambda file_path, display_name: print('*', end='', flush=True))
  File "C:\Users\amanda.oneal\Anaconda3\lib\site-packages\mosspy\moss.py", line 131, in send
    self.uploadFile(s, file_path, display_name, index, on_send)
  File "C:\Users\amanda.oneal\Anaconda3\lib\site-packages\mosspy\moss.py", line 104, in uploadFile
    s.send(message.encode())
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host

Help?

Sent Web Page Not Found

I've tried running MOSS a few times this morning. When I receive the generated HTML page back from MOSS I'm getting a 404 not found when I try to go to that page.

I've tried about a dozen times this morning and keep getting 404 for all the newly generated pages. I have used this script many times successfully in the past.

Question about the moss connection

When I use the original moss shell script to upload some code, it gets stuck at 'OK', and after a very long time (almost 5 minutes) it continues uploading the files. But it returns nothing after 'Waiting for the server's response'.
Any idea what I should do?
Thanks a lot.

Moss not working?

Is moss not working?

I am trying to run addFiles, but no terminal output appears, not even the * that should be printed. My code was not changed in any way.
All I get is:
image

Returning empty URL

Hello, I've had success with the program in the past until recently. I used to submit about 45 files in one go, but currently it only works with 2.

I did not change anything on my computer, and I've supplied the correct userid. I've checked with telnet that I can successfully connect to moss.stanford.edu at port 7690

Here is the error that returns:


Exception Traceback (most recent call last)
in
19
20 #Save report file
---> 21 m.saveWebPage(url, "report/7pm_report.html")
22
23 # Download whole report locally including code diff links

~\AppData\Local\Continuum\anaconda3\lib\site-packages\mosspy\moss.py in saveWebPage(self, url, path)
142 def saveWebPage(self, url, path):
143 if len(url) == 0:
--> 144 raise Exception("Empty url supplied")
145
146 response = urlopen(url)

Exception: Empty url supplied

Is It Possible To Transform Your Code?

Thank you for developing this package.
I'm building a cheating-detection page and plan to use the mosspy package.
One obstacle is that our autograder saves assignments in a database, not in folders.

So I modified the moss.py module to submit byte-encoded strings to the server (instead of opening files from a folder and sending those), but it returns a broken pipe error.

I have a few questions about why this happens:

  1. Is it possible to use only the moss.py module?
  2. If so, what could cause the error? The upload_file function seems to be going wrong; should I look at its input parameters?
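
A route that leaves moss.py unmodified is to write the database contents to temporary files and submit those through the normal `addFile` call. A minimal sketch, assuming each submission is available as a text string keyed by a flat file name (no path separators); `write_submissions` is a hypothetical helper:

```python
import os

def write_submissions(sources, directory):
    """Write {file_name: source_text} to files in directory and return their paths."""
    paths = []
    for name, text in sorted(sources.items()):
        path = os.path.join(directory, name)
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(text)
        paths.append(path)
    return paths

# Usage with mosspy (not run here, needs network access):
# import tempfile
# with tempfile.TemporaryDirectory() as tmp:
#     for path in write_submissions(rows_from_database, tmp):
#         m.addFile(path)
#     url = m.send()
```

The files only need to exist until `send()` returns, so a temporary directory is sufficient.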

Compare Sub Folder

How do I compare subfolders with other folders?
I found the directory mode option, but what are the possible values for it when comparing folders?

Getting Empty URL

Report URL:
Traceback (most recent call last):
File "/home/ib-developer/Desktop/Roopa/Data_Analytics_utils/MOSS/moss.py-1.0.8/moss_usage.py", line 20, in
m.saveWebPage(url, "MOSS/moss.py-1.0.8/submission/report.html")
File "/home/ib-developer/Desktop/Roopa/Data_Analytics_utils/MOSS/moss.py-1.0.8/mosspy/moss.py", line 144, in saveWebPage
raise Exception("Empty url supplied")
Exception: Empty url supplied

I used this code

import mosspy

userid = 987654321

m = mosspy.Moss(userid, "python")

m.addBaseFile("MOSS/moss.py-1.0.8/submission/a01.py")
m.addBaseFile("MOSS/moss.py-1.0.8/submission/test_student.py")

# Submission Files
m.addFile("MOSS/moss.py-1.0.8/submission/a01-sample.py")
m.addFilesByWildcard("MOSS/moss.py-1.0.8/submission/a01-*.py")


url = m.send()

print ("Report URL: " + url)

# Save report file
m.saveWebPage(url, "MOSS/moss.py-1.0.8/submission/report.html")

mosspy.download_report(url, "submission/report/", connections=8, log_level=10) # logging.DEBUG (20 to disable)

Modify UploadFile

Hello,
I prefer using filename_fixed = file.replace(" ", "_") instead of filename_fixed = os.path.basename(file).replace(" ", "_").

In my case, I have many folders in the submission folder, and each contains a file with the same name (MainActivity.java). The resulting report cannot tell me which folder a file is in. More importantly, the different MainActivity.java links in the report all point to the same MainActivity.java!

Directory Structure Example

                      - No.1 - MainActivity.java
                      |
Submission          - - No.2 - MainActivity.java
                      |
                     -  No.3- MainActivity.java

If you agree, I would create my pull request :)
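
The naming idea can be tried from the calling side first. A minimal sketch that builds a label from the path relative to the submission root, so files with the same name in different folders stay distinguishable; `display_name_for` is a hypothetical helper, and passing its result as a second argument to `addFile` assumes that method accepts an optional display name (the tracebacks above show `uploadFile` receiving a `display_name`, but check the installed version):

```python
import os

def display_name_for(file_path, root):
    """Build a report label from the path relative to root, keeping folder names."""
    relative = os.path.relpath(file_path, root)
    # Use forward slashes on every platform and avoid spaces in the label.
    return relative.replace(os.sep, "/").replace(" ", "_")

# Usage with mosspy (not run here; the second argument is an assumption):
# m.addFile(path, display_name_for(path, "Submission"))
```

Whether the server accepts slashes in labels is not confirmed here, so a small test submission is worth running first.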
