ooni / probe Goto Github PK


OONI Probe network measurement tool for detecting internet censorship

Home Page: https://ooni.org/install

License: BSD 3-Clause "New" or "Revised" License

ooniprobe

probe's Introduction

Android, iOS, Desktop, CLI

Download the desktop app

OONI Probe is free and open source software designed to measure internet censorship and other forms of network interference.

Every month, thousands of networks are measured by OONI Probe users in more than 200 countries. Since 2012, millions of network measurements have been published from around the world.

OONI Probe is available for the following platforms:

Android: probe-android (Play Store, F-Droid)
iOS: probe-ios (App Store)
Desktop App: probe-desktop
Command Line tool: probe-cli

OONI Probe tests are implemented in Go in github.com/ooni/probe-cli.

OONI Probe tests the lists of websites included in the citizenlab/test-lists repository.

OONI Probe used to be written in Python. The legacy Python version of OONI Probe is available here: probe-legacy.

To learn more about OONI, check out our website: https://ooni.org/

probe's People


isislovecruft funsim rafiot nikoomba duy jonmtoz flavioamieiro shadiesna graingert hellais ingydotnet meejah ioerror samsmith bhitov rrana nathan-at-least alexwebr not-the-nsa david415 tomwills 0xhaven olafmk browserblade glamrock gntnbn nosmo salarcon215 nirs irl revollat joelanders kdm9 igueths dantheta alexandregz 0xpoly d33tah mikeaddison93 cephurs icaas karthikeyan-kkk w8mej kcadkins juga0 noscripter bytemaker81 s-york anarcat uaalto tiredadmin neuroidss kwadronaut skmezanul courado andresazp mzirintu guyhughes seamustuohy jameskumar ecneladis agrabeli pleiatheus prayagverma universal-it-systems notyours180 hanlimo vmon laurfan91 samdney willscott cgruppo greg5678 archer-sys darkk gvsurenderreddy thenavigat tardummy01 awesome-security acidburn0zzz securitywarrior ioef paddymahoney jakubd shivai krodyx thepigeonoftime michaeldarius rosesinthemath tparks5 alexxnica kryndex pirater 451hackathon digideskio luceatnobis mckakusi xrmx equalitie jamiepg4

probe's Issues

HTTP & DNS Parsers

We have parsers for the HTTP & DNS test reports.

Create a data flow diagram related to the m-lab deployment of ooni

We would like to have a DFD of how the data flow works in the mlab data pipeline.

Specifically: once a report is stored locally in our ooni backend collector, how does it end up published on cloud storage and big table?

@stephen-soltesz can you take care of this?

DNS Host Resolution

Add DNS host resolution to tls_handshake.py and input processor for a URI list.

Data Format Documentation

What data is collected on M-Lab?
How will Ooni data be converted for use with Big Query?
What is the file format of the report files written by the backend and received from the client?

Handle Backend Failures Gracefully

And notify the user.

What if I want to use something that is new to 2.7? Should I just take the code I need and put it into utils? I ask because I want the namedtuple, OrderedDict, and Counter classes from the collections module, but the latter two classes are new to 2.7.
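One possible answer to the question above is a small compatibility shim in utils. The sketch below is only an illustration: `SimpleCounter` is a hypothetical stand-in that covers counting and `most_common()` and nothing else, and OrderedDict would need a similar backport that is not shown here.

```python
import collections


class SimpleCounter(dict):
    """Hypothetical stand-in for collections.Counter (counting only)."""

    def __init__(self, iterable=()):
        dict.__init__(self)
        for item in iterable:
            self[item] = self.get(item, 0) + 1

    def most_common(self, n=None):
        # Sort by count, highest first; sorted() is stable, so ties keep
        # their insertion order on interpreters with ordered dicts.
        ranked = sorted(self.items(), key=lambda pair: pair[1], reverse=True)
        return ranked if n is None else ranked[:n]


# Use the standard class when the interpreter provides it (2.7 and later),
# otherwise fall back to the local stand-in.
Counter = getattr(collections, "Counter", SimpleCounter)
namedtuple = collections.namedtuple  # already available in Python 2.6
```

Callers would then import `Counter` from the shim module rather than from collections directly, so the fallback stays in one place.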

Installation, Configuration, & Use Manual

A guide for downloading, setting up, and using Ooni, including an explanation of data privacy and consent.

Fix path to https://ooni.torproject.org/inputs/input-pack.tar.gz

The Makefile in ooni-probe/inputs/ points at the file https://ooni.torproject.org/inputs/input-pack.tar.gz, but this is 404'd.

M-Lab Integration Testing

Deployment works, updates work, mlab-ns works...

Threat Model: Add "Collateral Infrastructural Damage" Category to Threat Taxonomy

Synopsis

Issues such as #133 represent a risk not currently captured by the Threat Taxonomy so we need a new category, probably under "Resource Abuse". The category "Resource Abuse" should be renamed to "Resource Risks" to generalize it to encompass unintentional problems.

Also, the Leveraged Attacks under Resource Risks (was: ~~Resource Abuse~~) in the Threat Taxonomy has a FIXME comment about unintentional DOS.

Close Criteria

☑ Rename "Resource Abuse" to "Resource Risks"
☑ Add Collateral Infrastructure Damage to that category.
☑ Close #133 and Unintentional DOS threat.
☐ brainstorm other possibilities.

Related Issues

#133

Policy on net-tests that perform out-of-spec requests, e.g. "HTTP Invalid Request Line"

An "out-of-spec message" is a protocol message that is not valid according to the relevant protocol specification (e.g. RFC 2616 for HTTP).

Despite Postel's principle ("Be generous in what you accept"), we know that in practice network software often fails to be robust against unexpected messages. If the request is not designed to be an attack and has no malicious payload, the bad effects are normally limited to denial of service to other clients. Let's call a network component (origin server, proxy, middlebox) "fragile" if receiving an out-of-spec message causes it to fail to give correct service to other clients.

The policy question for this issue is whether Ooni-probe 1.0 should ship with net-tests that perform out-of-spec requests. (Whether Ooni-b can relay out-of-spec requests is also important, but not in the scope of this issue.)

Of the candidate tests in issue #89, the only one that sends out-of-spec messages according to its description seems to be "HTTP Invalid Request Line".
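To make "out-of-spec" concrete for HTTP, here is a rough sketch of a check for the RFC 2616 Request-Line shape (Method SP Request-URI SP HTTP-Version CRLF). The grammar is deliberately simplified, and the function name is an assumption for illustration, not part of ooni-probe.

```python
import re

# Simplified RFC 2616 grammar: Method SP Request-URI SP HTTP-Version CRLF.
# The method is a "token"; the Request-URI is approximated as any run of
# characters that contains no space, CR, or LF.
_TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
_REQUEST_LINE = re.compile(_TOKEN + r" [^ \r\n]+ HTTP/[0-9]+\.[0-9]+\r\n")


def is_in_spec_request_line(line):
    """Return True if `line` matches the simplified Request-Line grammar."""
    return _REQUEST_LINE.fullmatch(line) is not None
```

A line such as "GET /index.html HTTP/1.1" followed by CRLF passes this check, while a line with a missing version or a doubled space does not. A test like "HTTP Invalid Request Line" sends lines of the second kind.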

Here are the arguments I see against including such net-tests:

a) The operators of network components affected by out-of-spec messages may view them as attacks.

Note that many countries, including Western liberal democracies, have computer misuse laws that cover activities that are perceived to be exploiting bugs in network software. It may be difficult for someone suspected under such a law to defend themself, especially if the suspicion draws attention to their other activities. There have been cases where people were convicted under computer misuse laws for sending out-of-spec messages (or messages that were perceived to be out-of-spec), even when there is considerable doubt that they intended to perform an attack, e.g. http://www.theregister.co.uk/2005/10/11/tsunami_hacker_followup/. The risk in countries where the rule of law is less consistently applied is only likely to be worse.

It's also possible that other network users could be misidentified as having originated the probe, and similarly be viewed as attackers.

b) Effects on fragile network components can deny service to other network users.

Suppose, for instance, that a censoring proxy fails because it receives an out-of-spec message. It may fail "open" (i.e. let through subsequent requests) or "closed" (i.e. fail to correctly relay subsequent requests). If it fails closed, then the probe has had the effect of making the censorship worse for other network users, at least in the short term, which is obviously counterproductive.

This can happen whether or not the fragile component was part of a system of network interference. The description of "HTTP Invalid Request Line" seems to make the implicit assumption that only "interfering" network components are likely to be fragile. This is wrong; transparent/caching HTTP proxies, firewalls that are not intended to be interfering, and origin servers can also realistically be fragile.

c) Effects on fragile network components can result in misleading measurements.

In principle, any active network test can change the behaviour of the network. In fact such changes are one of the things we want to measure! However, if the intent is to measure network behaviour that could in principle have been encountered by non-test clients in normal operation, then failures triggered only by out-of-spec probes could result in misleading overreporting of network interference.

Documenting these problems for specific tests only partially addresses point a), since a user can't really be expected to have enough information to determine their risk of being viewed as an attacker. It does not address points b) and c) at all.

If ooni-probe supports running such tests but the results are not stored by a given collector (e.g. MLab's collectors), that would address point c) for that collector, but not points a) and b).

(There are some other net-tests that send atypical messages that are clearly in-spec, but that seems to be much less of an issue; other client software will occasionally send messages that are atypical in the same way.)

URL Input Processors

Write test input processors for every test that can use a URI in a meaningful way.

Fix Timeouts

Tests should not stall forever.
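As an illustration of the idea, a per-test deadline can be enforced by wrapping each measurement in a timeout. The legacy probe was built on Twisted; the sketch below uses the standard library's asyncio instead, purely to keep the example self-contained, and all function names are hypothetical.

```python
import asyncio


async def run_with_timeout(test_coro, seconds):
    """Run one measurement; never wait longer than `seconds` for it."""
    try:
        result = await asyncio.wait_for(test_coro, timeout=seconds)
        return ("ok", result)
    except asyncio.TimeoutError:
        # The pending measurement is cancelled by wait_for before we get here.
        return ("timeout", None)


async def stalled_test():
    # Hypothetical measurement that never completes on its own.
    await asyncio.sleep(3600)


async def quick_test():
    # Hypothetical measurement that completes immediately.
    await asyncio.sleep(0)
    return "done"
```

Returning a status instead of raising lets the caller record the timeout in the report as a test failure and move on to the next input.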

Design Specifications

~~How Ooni uses obfs and bridges.~~
~~Design justification for use of Tor hidden services as control and reporting channel.~~
~~Specification for what an Ooni test is.~~
~~Enumeration and specification of unit tests~~
?? Description of startup temp file recovery process.
~~Known limitations of the Ooni framework.~~

Diagrams

~~Backend internal DFD~~
~~Probe internal DFD~~
~~Backend <-> probe wire protocol~~
~~M-Lab reporting DFD~~
~~Scheduler diagram~~

Agree & Document Ooni Release Process

Threat Model: Change `mlab-ns` role to generic Directory Service Operator

Synopsis

Currently the Role Definitions include mlab-ns Operator.

Close Criteria

Close this ticket after this role is changed to a more generic Directory Service Operator, and all wiki references are updated. Issues specific to mlab-ns should be retained as examples, so that the Threat Model is still useful for MLab.

Is non-determinism in test helper deployment or MLab-ns API acceptable?

Close this ticket with a yes / no.

The MLab initialize.sh script for Ooni selects which test helpers bind to a given port randomly. The requirement is for the same port to provide multiple distinct test helpers, so the current strategy is to partition the MLab slices (and thus IP addresses) for each port according to how many helpers require that port. The random selection accomplishes this in a stateless / configuration-free manner.

Meanwhile, the probe will use the mlab-ns web service to request test helpers and a collector prior to running a net-test. This service currently responds non-deterministically (with various constraints and prioritizations such as scoring based on load).

The question is: Are these two sources of non-determinism a problem?

For scientific repeatability, randomness adds noise. For diagnostic reasons, determinism can make it simpler to understand logs or report data. For security reasons, censors might be able to game non-determinism in a way to favor particular test results. It may be that none of these concerns are strong enough (also considering the dev cost of removing the non-determinism).

If the answer is "no", there's a dev cost implication for mlab-ns which should be coordinated with MLab.
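If determinism were wanted on the deployment side, one stateless option would be to derive the choice from a hash of the slice's own hostname rather than from a random number. This is only a sketch under that assumption; the function name, hostnames, and helper names are hypothetical, and it is not how initialize.sh currently works.

```python
import hashlib


def pick_helper(slice_host, port, helpers_for_port):
    """Deterministically pick one helper for this slice and port."""
    helpers = sorted(helpers_for_port)  # make the result order-independent
    if not helpers:
        raise ValueError("at least one helper is required")
    key = "{}:{}".format(slice_host, port).encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Interpret the first 8 bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:8], "big") % len(helpers)
    return helpers[index]
```

Each slice can compute its own assignment without shared configuration, and the result is reproducible from logs. Unlike an explicit partition, a hash does not guarantee that the helpers are spread evenly across slices.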

ooni test decks specifying logfile path but it is not used.

Jake reports that ooni is ignoring test .deck logfile paths.

Test/Helper Versioning Spec

Specify the procedure for moving to a new version of a nettest and test helper, and the mechanism for a server to notify a client about obsolete or dangerous tests/helpers.

Measure the code coverage of the ooni probe unittests

As we agreed in #107, we should assess how much code coverage we are reaching with our unit tests (by using a tool like coverage) and possibly integrate it with coveralls.io.

Here are some suggestions by @nathan-at-least

Some handy automated tools are:

API documentation generator from python doc strings - so that anyone can browse the names and intent of particular tests.
Coverage analysis - see coverage which can generate html reports of which lines of application code are exercised by unit tests. This is a quick way to notice untested portions of code.
Test Bots - Setting up a bot to run unit tests then generate an html report for various revisions and platforms can quickly show regressions.

Make http_requests record the Tor exit address used

It would be handy if the http_requests test also recorded the IP and nickname of the Tor exit IP that the http fetch occurred over, as we would get exit scanning for free.

Triage Test Inclusion

Review the list of proposed tests for Ooni's initial release, and decide which to include, and which to defer.

Specify ooni-backend logic for enforcing collection policy

If a backend has specified a collection policy it should enforce the policy.

Document how the ooni reporting state machine is changed, if at all.
Document how the ooni CREATE report API is changed, if at all.

https://github.com/TheTorProject/ooni-probe/issues/115

Test Specifications

Specification of each test helper.
Specification of each test.
Human-usable explanations of the semantic output of each test.

Make Sphinx dev docs available on Github pages

clock skew

When the clock on a Tor client is so wrong that the Tor network consensus cannot be reached, exit with a user-comprehensible error rather than hanging forever.

test

test issue

Finalize Supported Tests

Complete the list of tests that will be deployed with the initial release.

Threat Model

Ooni has a complete and useful threat model.

test

Threat Model: Add inter-role reliances.

Synopsis

The Role Definitions currently have many FIXME comments for Reliances.

Close Criteria

This ticket should be closed when every role's reliances are specified. The Organizational Reliances are a set of statements of the form: "Role X relies on Role Y for ..."

Related Issues

#134 "Threat Model: Clarify terminology for reliance distinctions."

Data Pipeline

Ooni is in the M-Lab data pipeline.

Threat Model: Incorporate old impacts into Taxonomy and Impact table.

Synopsis

There are Unincorporated Impacts below the new Impacts table.

Close Criteria

Add those rows to the Threat Taxonomy where relevant.
Add any new threats to the Impacts table.
Some of the legacy rows are now represented by multiple threats. Review the legacy role and the associated new threat impacts, and ensure no impact details are lost.
Comment which legacy roles were "distributed" among the new impacts, or which ones were removed as irrelevant on this ticket and in the wiki changelog.

exceptions.ImportError: cannot import name _WrappingProtocol

d@d:~/ooni-probe$ ./bin/ooniprobe nettests/blocking/dnstamper.py -f hosts.txt
WARNING: Failed to execute tcpdump. Check it is installed and in the PATH
Log opened.
[D] No test deck detected
[D] processing options
[D] Checking if backend is present
[D] Checking if file is present
Starting Tor...
[D] Setting control port as 26177
[D] Setting SOCKS port as 45954
[D] 10%: Finishing handshake with directory server
[D] 15%: Establishing an encrypted directory connection
[D] 20%: Asking for networkstatus consensus
[D] 25%: Loading networkstatus consensus
[D] 45%: Asking for relay descriptors
[D] 50%: Loading relay descriptors
[D] 53%: Loading relay descriptors
[D] 57%: Loading relay descriptors
[D] 61%: Loading relay descriptors
[D] 64%: Loading relay descriptors
[D] 68%: Loading relay descriptors
[D] 72%: Loading relay descriptors
[D] 76%: Loading relay descriptors
[D] 80%: Connecting to the Tor network
[D] 85%: Finishing handshake with first hop
[D] 90%: Establishing a Tor circuit
[D] 100%: Done
[D] Building a TorState
Successfully bootstrapped Tor
[D] We now have the following circuits:
[D] * <Circuit 1 BUILT [194.109.206.212] for GENERAL>
[D] * <Circuit 2 BUILT [88.198.100.230] for GENERAL>
[D] * <Circuit 3 BUILT [195.242.152.250] for GENERAL>
[D] * <Circuit 4 BUILT [195.191.16.63] for GENERAL>
[D] * <Circuit 5 BUILT [91.206.27.30] for GENERAL>
[D] * <Circuit 6 BUILT [94.126.178.1] for GENERAL>
[D] * <Circuit 7 BUILT [188.40.32.154] for GENERAL>
[D] * <Circuit 8 LAUNCHED [] for GENERAL>
[D] * <Circuit 9 BUILT [87.106.249.118] for GENERAL>
[D] * <Circuit 10 BUILT [31.172.30.4] for GENERAL>
[D] * <Circuit 11 BUILT [176.65.109.60] for GENERAL>
[D] * <Circuit 12 BUILT [173.246.82.97] for GENERAL>
[D] * <Circuit 13 BUILT [178.86.31.41] for GENERAL>
[D] * <Circuit 14 BUILT [37.130.227.133] for GENERAL>
[D] * <Circuit 15 BUILT [173.254.216.69] for GENERAL>
[D] * <Circuit 16 BUILT [80.237.226.75] for GENERAL>
[D] * <Circuit 17 BUILT [31.172.30.2] for GENERAL>
[D] * <Circuit 18 BUILT [31.172.30.1] for GENERAL>
[D] * <Circuit 19 BUILT [85.25.108.113] for GENERAL>
[D] * <Circuit 20 BUILT [204.45.185.164] for GENERAL>
[D] * <Circuit 21 BUILT [62.141.42.149] for GENERAL>
[D] * <Circuit 22 BUILT [37.130.227.134] for GENERAL>
[D] * <Circuit 23 BUILT [85.214.73.63] for GENERAL>
[D] * <Circuit 24 LAUNCHED [] for GENERAL>
[D] * <Circuit 25 BUILT [166.70.154.130] for GENERAL>
[D] * <Circuit 26 BUILT [193.10.227.195] for GENERAL>
[D] * <Circuit 27 BUILT [85.25.110.235] for GENERAL>
[D] * <Circuit 28 BUILT [68.169.35.102 5.135.176.63 46.167.245.50] for GENERAL>
[D] * <Circuit 29 EXTENDED [68.169.35.102 83.212.98.169] for GENERAL>
[D] * <Circuit 30 EXTENDED [68.169.35.102] for GENERAL>
[D] Obtained our IP address from a Tor Relay None
[D] Running [(<class 'nettests.blocking.dnstamper.DNSTamperTest'>, 'test_a_lookup')]
[D] Options {'inputs': <ooni.nettest.inputProcessorIterator object at 0xa8280cc>, 'version': '0.4', 'name': 'DNS tamper'}
[D] cmd_line_options {'pcapfile': None, 'help': 0, 'subargs': ('-f', 'hosts.txt'), 'resume': 0, 'parallelism': '10', 'test': 'nettests/blocking/dnstamper.py', 'logfile': None, 'collector': None, 'reportfile': None}
[D] testsEnded: Finished running all tests
Unhandled error in Deferred:
Unhandled Error
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py", line 489, in _startRunCallbacks
    self._runCallbacks()
  File "/usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py", line 576, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/home/d/ooni-probe/ooni/oonicli.py", line 107, in runTestList
    d1 = runner.runTestCases(test_cases, options, cmd_line_options)
  File "/usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py", line 1214, in unwindGenerator
    return _inlineCallbacks(None, gen, Deferred())
--- <exception caught here> ---
  File "/usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py", line 1071, in _inlineCallbacks
    result = g.send(result)
  File "/home/d/ooni-probe/ooni/runner.py", line 416, in runTestCases
    oonib_reporter = OONIBReporter(cmd_line_options)
  File "/home/d/ooni-probe/ooni/reporter.py", line 271, in __init__
    from ooni.utils.txagentwithsocks import Agent
  File "/home/d/ooni-probe/ooni/utils/txagentwithsocks.py", line 15, in <module>
    from twisted.internet.endpoints import TCP4ClientEndpoint, SSL4ClientEndpoint, _WrappingProtocol, _WrappingFactory
exceptions.ImportError: cannot import name _WrappingProtocol

Main loop terminated.

Twisted version:

d@d:~/ooni-probe$ ./bin/ooniprobe nettests/blocking/dnstamper.py --version
WARNING: Failed to execute tcpdump. Check it is installed and in the PATH
Log opened.
[D] No test deck detected
Twisted version: 12.3.0

The _WrappingProtocol class is no longer present in the latest version of Twisted (stable, 12.3.0):

https://twistedmatrix.com/documents/current/api/twisted.internet.endpoints.html
https://twistedmatrix.com/documents/12.1.0/api/twisted.internet.endpoints.html
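Because the failing import pulls in private (underscore-prefixed) names, one way to surface this kind of breakage early is to check for the names at startup and print a clear message instead of failing inside a Deferred. The helper below is a hypothetical sketch, not existing ooni-probe code; it only uses the standard library's importlib.

```python
import importlib


def missing_names(module_name, names):
    """Return the subset of `names` that `module_name` does not define."""
    module = importlib.import_module(module_name)
    return [name for name in names if not hasattr(module, name)]
```

For this report, calling `missing_names("twisted.internet.endpoints", ["_WrappingProtocol", "_WrappingFactory"])` before running any test would list whichever of those names the installed Twisted lacks, so the probe could exit with a message naming the incompatible dependency.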

Test/Helper Versioning

Implement backend rejecting probe's request for a test helper with a notification of obsolescence or risk.
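A minimal sketch of what such a rejection check could look like follows. The policy layout (a "minimum" version and a list of "dangerous" versions per helper), the function names, and the reason strings are all assumptions made for illustration; the actual backend format is not specified in this issue.

```python
def parse_version(text):
    """Turn a dotted version such as '0.4' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def check_helper_request(policy, helper, version):
    """Return (accepted, reason) for a probe's test helper request."""
    entry = policy.get(helper)
    if entry is None:
        return (False, "unknown test helper")
    if version in entry.get("dangerous", ()):
        return (False, "version is flagged as dangerous")
    if parse_version(version) < parse_version(entry["minimum"]):
        return (False, "version is obsolete; minimum is " + entry["minimum"])
    return (True, "ok")
```

The reason string is what the backend would send back to the probe, so the user learns whether the request was refused for obsolescence or for risk.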

Threat Model: Add definitions and examples for correlation threats.

Synopsis

The threat category Deanonymizing Data Correlation lacks definitions and examples.

Close Criteria

Add definitions to each.
Add examples.

Implement input whitelisting in oonib collector

It may be useful for a collector to specify which inputs it supports. This allows the creation of topic-specific collectors (e.g. a collector for news websites, a collector for blogs, a collector for Italy, etc.).

When an input is not supported by the specified collector, the collector should reply with an error.

The collector should also expose an HTTP API where you can download all the supported inputs.
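The whitelist check itself could be as small as the sketch below. The function name and the shape of the returned dictionary are hypothetical; the issue does not define the error format a collector would send.

```python
def check_inputs(supported_inputs, requested_inputs):
    """Compare a probe's inputs against the collector's whitelist."""
    supported = set(supported_inputs)
    # Keep the probe's original order so the error is easy to read.
    rejected = [item for item in requested_inputs if item not in supported]
    if rejected:
        return {"status": "error", "unsupported": rejected}
    return {"status": "ok", "unsupported": []}
```

Listing the unsupported inputs in the reply, rather than returning a bare error, would let the probe tell the user exactly which inputs the collector refused.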

This is related to: TheTorProject/ooni-probe#109

Contributor Bootstrap

How should a contributor get started helping out with Ooni? What documentation should they read? What are good projects for them to tackle?

Data Privacy Review

Measurement Lab and Least Authority complete a privacy audit of the data probes submit and M-Lab publishes, and we limit our initial release to the safe subset.

Threat Model: Add "injection attacks" in non-report data section.

Synopsis

The Threat Taxonomy already includes injection attacks, but it is specific to data within reports. There is also a potential threat of injection attacks in non-report data, such as in HTTP server logs of probe target webservers.

Close Criteria

Add an injection attack bullet inside "Bad Non-Report Data".

Implement ooni-probe handling of collector policy

When an input is not supported by the specified collector, we should output a message and prompt the user to enter an alternative collector.

https://github.com/TheTorProject/ooni-probe/issues/114
https://github.com/TheTorProject/ooni-probe/issues/115
https://github.com/TheTorProject/ooni-probe/issues/116

Backend Specification

Produce specification/design-document ooni-backend.

Test Failure Handling

Test failures should be reported correctly by the probe.

Add support for looking up test helper addresses

ooni-probe needs to get the addresses of test helpers and collectors. The M-Lab NS will handle listing all the test helpers running and the .onion address of the collector running on that machine.

An example of how this service works can be found here: https://mlab-ns.appspot.com/neubot?format=json.

We are making the assumption that no test requires more than one test helper bound to the same port (i.e. the HTTP Return JSON Header and TCP Echo test helpers cannot both be used at the same time in a nettest).
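The assumption above can be checked mechanically before a nettest is scheduled. In this sketch the function name and the helper-to-port mapping are hypothetical; the real port assignments are not given in this issue.

```python
from collections import defaultdict


def port_conflicts(required_helpers, helper_ports):
    """Return {port: [helpers]} for every port needed by more than one helper."""
    by_port = defaultdict(list)
    for helper in required_helpers:
        by_port[helper_ports[helper]].append(helper)
    return {port: names for port, names in by_port.items() if len(names) > 1}
```

An empty result means the nettest satisfies the assumption; a non-empty result names the helpers that would have to share a port.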

Implement ooni-backend collector API for exposing test decks and inputs

An ooni-probe could learn about test decks and available input lists from a collector and provide a UI for choosing and running these tests.

A collector operator would specify the experiments they would like to collect in the form of a set of test decks with accompanying input lists, and a probe operator would be able to select one of the available experiments and then perform the measurements.
#115

Threat Model: Clarify terminology for reliance distinctions.

Synopsis

We need different terminology for these distinct kinds of statements:

"Role A relies on Role B for behavior ..."
"Role A relies on Component X for ..."
"Component X relies on Component Y for ..."

Close Criteria

This issue can be closed when both of these are satisfied:

There are clear names for the three distinct categories.
All of the wiki documentation is updated to use those terms.

Details

Legacy: The wiki already uses this terminology:

"Role A relies on Role B for behavior ..." - This is already called a Role's reliances.

Suggested terminology:

"Role A relies on Role B for behavior ..." - An Organizational Reliance
"Role A relies on Component X for ..." - A UX Reliance (where UX is short for usability)
"Component X relies on Component Y for ..." - A Technical Dependency

Specify the ooni-backend HTTP API for defining and exposing collection policy

Determine what should be specified -- input sets, specific tests, test decks?
Determine a syntax for how policy is specified.
Determine where (on which system/component/file) policy will be specified.
Specify any additional state between ooni-probe and ooni-backend.

The API should expose the inputs that are supported by the collector backend and the list of test decks that are curated by the collector.

Mlab-ns Operational

M-Lab's name system is ready for Ooni usage.

Specify how ooni-probe handles ooni-backend collector policy.

When an input is not supported by the specified collector, we should output a message and prompt the user to enter an alternative collector.

https://github.com/TheTorProject/ooni-probe/issues/109
https://github.com/TheTorProject/ooni-probe/issues/115
https://github.com/TheTorProject/ooni-probe/issues/116

Twisted's doRequest returning empty response.body?

Lovely OONI people, README.md did not recommend which trac to use, so here I am.

I have been playing with porting some personal testing scripts to the framework and have come to appreciate a limitation in doRequest. It seems that unless a responding server returns a very specific set of headers, Twisted records response.body as an empty string. After hours of nginx annoyance, I found by running a simple node.js server that at least 'Content-Type', e.g. 'text/html; charset=iso-8859-1', had to be set (if not 'Trailer' or others). This affects http_requests.py, as a substantial number of sites return empty bodies over both connections.
