
haskell-tor's Introduction

A Tor Implementation in Haskell

This version of haskell-tor is (C) 2015 Galois, Inc., and distributed under
a standard, three-clause BSD license. Please see the file LICENSE,
distributed with this software, for specific terms and conditions.

What is Tor?

Tor is a secure onion routing network that provides anonymized access to both the public Internet and a series of Tor-internal hidden services. Much more information about Tor can be found at https://www.torproject.org.

Many thanks to that project for all the hard work it has put into developing and evangelizing Tor.

What is in this repository?

This repository contains a Tor implementation in Haskell. The goal is for it to eventually be a fully compliant Tor implementation, but at the moment it lacks some features:

  • Support for finding or implementing hidden services.
  • Proper flow-control support.
  • Statistics updating.
  • Directory server support.

Using this library as an entrance node (i.e., to create anonymized connections to hosts on the Internet) is fairly well tested and should be functional. Relay and exit node support is implemented but much less well tested. For whichever use case you have, please report any problems you find to the GitHub issue tracker.

Building haskell-tor

This library uses cabal as its build system, and should work for Mac, Unix, and HaLVM-based installations. Windows support may work ... we just haven't tested it.

Understanding Network Stacks

The haskell-tor library is built such that it can use one of two built-in network stacks and/or a third-party network stack that you provide. How you get each of these is governed by two flags that correspond to the two network stacks:

  • network ensures that haskell-tor includes defaults for the standard, sockets-based network stack as described in the Haskell network library.

  • hans ensures that haskell-tor includes defaults for the Haskell Network Stack (HaNS), which is a clean-slate network stack that runs off raw Ethernet frames.

The defaults are a little complicated. To help try to sort things out, here is a table that describes all the combinations of flags, and what the default is for each platform:

Default  Platform  network  hans   Meaning
         Normal    True     True   Support for both hans and network
*        Normal    True     False  Support only network
         Normal    False    True   Support only hans
         Normal    False    False  No network stack support (BYONS)
         HaLVM     True     True   Support only hans (network ignored)
         HaLVM     True     False  No network stack support (see prev.)
*        HaLVM     False    True   Support only hans
         HaLVM     False    False  No network stack support (BYONS)
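
The table can also be read as a small decision function. The sketch below only restates the rows above; the names Platform, Support, and stackSupport are hypothetical and do not come from the haskell-tor sources.

```haskell
-- Hypothetical names; this only restates the flag table.
data Platform = Normal | HaLVM
  deriving (Eq, Show)

data Support = Both | OnlyNetwork | OnlyHans | BringYourOwn
  deriving (Eq, Show)

-- | stackSupport platform networkFlag hansFlag
stackSupport :: Platform -> Bool -> Bool -> Support
stackSupport Normal True  True  = Both
stackSupport Normal True  False = OnlyNetwork
stackSupport Normal False True  = OnlyHans
stackSupport Normal False False = BringYourOwn
-- On the HaLVM the network flag is ignored, so only hans matters.
stackSupport HaLVM  _     True  = OnlyHans
stackSupport HaLVM  _     False = BringYourOwn
```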

Standard Cabal Constraints

If you're building with the HaLVM, please add --constraint "tls +hans", --constraint "tls -network", and -f-network to your build flags. If you're using the integer-simple library (for example, to avoid GPL entanglements with unikernels), also add --constraint "cryptonite -integer-gmp", --constraint "scientific +integer-simple", and --constraint "scientific < 0.3.4.1".

In either case, we strongly suggest using sandboxes to keep everything nice and tidy.

Important Note

This is an early implementation of Tor that has not been peer-reviewed. Those with a true, deep need for anonymity should strongly consider using the mainline Tor client until and unless this version receives appropriate extensions, testing, and review.

Usage

As with most Haskell packages, this package can either be used as a library or as a binary package. Currently, the executable binary will simply perform an example get from whatismyip.com. Extending this to support a wider range of features is an open issue.

haskell-tor's People

Contributors

acw, elimisteve, erikd, iphydf

haskell-tor's Issues

Clean up Hans integration / configuration / building

There are a lot of Cabal flags floating around haskell-tor, cryptonite, and tls, and they're making everything a bit of a pain. I'm not sure how to fix things, but if we can figure out a way to fix them, that would be great.

Configuration File / Command Line Support

The executable built with haskell-tor is currently just a simple demo of pulling down data via Tor. I would, however, like to eventually expand this to be a replacement (albeit not a drop-in replacement) for the mainline Tor executable.

A first step towards this would be to allow people to define all the options in TorOptions through a configuration file and/or the command line arguments.

With regard to configuration file format I'm not particularly fussy, except that I sit next to the author of the config-value package at work and he made a pretty strong argument for it once. Plus I can bug the hell out of him if it doesn't work right.
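
For the command-line half of this, a minimal sketch using System.Console.GetOpt from base might look like the following. The fields of TorOptions are not listed in this issue, so DemoOptions, its two fields, and the flag names are all invented for illustration.

```haskell
import Control.Monad (foldM)
import System.Console.GetOpt
import Text.Read (readMaybe)

-- Hypothetical stand-in for TorOptions; these fields are assumptions.
data DemoOptions = DemoOptions
  { demoPort     :: Int
  , demoNickname :: String
  } deriving (Eq, Show)

defaultDemoOptions :: DemoOptions
defaultDemoOptions = DemoOptions { demoPort = 9002, demoNickname = "" }

-- Each flag becomes an update that may fail with an error message.
type Update = DemoOptions -> Either String DemoOptions

setPort :: String -> Update
setPort s o = case readMaybe s of
  Just p | p > 0 && p < 65536 -> Right o { demoPort = p }
  _                           -> Left ("invalid port: " ++ s)

setNickname :: String -> Update
setNickname s o = Right o { demoNickname = s }

optionDescrs :: [OptDescr Update]
optionDescrs =
  [ Option ['p'] ["port"]     (ReqArg setPort "PORT")     "port to listen on"
  , Option ['n'] ["nickname"] (ReqArg setNickname "NAME") "node nickname"
  ]

-- | Apply all flags, left to right, on top of the defaults.
parseArgs :: [String] -> Either String DemoOptions
parseArgs argv =
  case getOpt Permute optionDescrs argv of
    (updates, _, []) -> foldM (\o u -> u o) defaultDemoOptions updates
    (_, _, errs)     -> Left (concat errs)
```

Representing each flag as an update function means a configuration file could later produce the same list of updates, with command-line flags applied afterwards to override it.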

Flow Control (Circuit-level Flow Control)

In Section 7.3 of spec/tor-spec.txt, they define a flow control mechanism which is supposed to keep nodes from overwhelming each other with traffic. The existing system tries to implement this with the tsReadWindow field in TorSocket in src/Tor/Circuit.hs. However, I'm pretty certain that (a) this is the wrong place to do this, and (b) I implemented it wrong.

The incorrect existing behavior hasn't caused any problems ... yet. But I suspect it will, someday. So this is marked as a feature now, but should be marked as a bug should people find it causes problems.
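
As a reference point for discussion, here is a minimal sketch of the sender side of circuit-level flow control. It is not the existing tsReadWindow code. The constants (a window of 1000 cells, reopened by 100 per SENDME) follow my reading of Section 7.3 and should be checked against the current spec; PackageWindow, packageCell, and receiveSendme are hypothetical names.

```haskell
-- Constants as I read them in tor-spec.txt section 7.3 (an assumption to verify).
circuitWindowStart, circuitWindowIncrement :: Int
circuitWindowStart     = 1000
circuitWindowIncrement = 100

-- Number of data cells the sender may still package on this circuit.
newtype PackageWindow = PackageWindow Int
  deriving (Eq, Show)

newCircuitWindow :: PackageWindow
newCircuitWindow = PackageWindow circuitWindowStart

-- | Account for one outgoing data cell. Nothing means the window is
-- exhausted and the sender has to wait for a SENDME.
packageCell :: PackageWindow -> Maybe PackageWindow
packageCell (PackageWindow n)
  | n > 0     = Just (PackageWindow (n - 1))
  | otherwise = Nothing

-- | A circuit-level SENDME from the other side reopens the window.
receiveSendme :: PackageWindow -> PackageWindow
receiveSendme (PackageWindow n) = PackageWindow (n + circuitWindowIncrement)
```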

Windows Networking Support

My understanding is that the current implementation works well for standard POSIX systems, OS/X, and the HaLVM via HaNS. However, I believe I remember seeing an email or issue fly past that it didn't work so well on Windows. I also remember, at the time, glancing at Google and finding out that getting the network library to work on Windows was an adventure.

We should see if there's an alternate networking library to use and link against that, or see if there's anything we can do to simplify getting haskell-tor running on Windows.

"no relay/exit support" when running default executable

This is the output I get when running the default exe generated by a successful build:

[2016-03-26 12:16] Credentials created.
[2016-03-26 12:16] Signing key fingerprint: 8134b7fba2aa271884c7ccf9ac4314b056f68784
[2016-03-26 12:16] Onion key fingerprint: 6cb4771e317699b5f06497a18488314131a78787
[2016-03-26 12:16] TLS key fingerprint: b111b5ce1c241663f8d6c8c22a7c16cc96c55cfa
[2016-03-26 12:16] Fetching credentials for default directory moria1 [128.31.0.39:9131]
[2016-03-26 12:16] Fetch failed: Fetch timed out.
[2016-03-26 12:16] Fetching credentials for default directory tor26 [86.59.21.38:80]
[2016-03-26 12:16] Fetching credentials for default directory dizum [194.109.206.212:80]
[2016-03-26 12:16] Fetching credentials for default directory Tonga [82.94.251.203:80]
[2016-03-26 12:16] Fetch failed: Parser error: Failed reading: HTTP Error: Not found ["\r\n"]
[2016-03-26 12:16] Fetching credentials for default directory gabelmoo [131.188.40.189:80]
[2016-03-26 12:16] Fetching credentials for default directory dannenberg [193.23.244.244:80]
[2016-03-26 12:16] Weird: fingerprint mismatch. Ignoring dir.
[2016-03-26 12:16] Fetching credentials for default directory Faravahar [154.35.175.225:80]
[2016-03-26 12:16] Fetching credentials for default directory longclaw [199.254.238.52:80]
[2016-03-26 12:16] 5 of 8 default directories loaded.
[2016-03-26 12:16] String consensus document update.
[2016-03-26 12:16] Using directory dizum for consensus.
[2016-03-26 12:16] Waiting for Tor connections on port 9002
[2016-03-26 12:16] Added new directory entry for dannenberg
[2016-03-26 12:16] Added new directory entry for maatuska
[2016-03-26 12:16] Added new directory entry for urras
[2016-03-26 12:17] Failed to add new directory for moria1
[2016-03-26 12:17] Found enough valid signatures: dannenberg, tor26, longclaw, maatuska, urras, dizum, gabelmoo, Faravahar
[2016-03-26 12:17] New router processing complete. 2950 of 6177 routers available.
[2016-03-26 12:17] Will reload census at 13:53
[2016-03-26 12:17] Trying to connect to 84.75.230.35
[2016-03-26 12:17] Just built connection with them.
[2016-03-26 12:17] Created new link to 84.75.230.35 ("OdyX")
[2016-03-26 12:17] Built initial link to "84.75.230.35" with circuit ID 4172906913
[2016-03-26 12:17] Created circuit 4172906913
[2016-03-26 12:17] Extending circuit to "78.22.238.222"
[2016-03-26 12:17] Extending circuit to "176.9.4.206"
[2016-03-26 12:17] Extending circuit to "14.136.236.129"
[2016-03-26 12:17] Extending circuit to "86.219.120.200"
[2016-03-26 12:17] Extending circuit to exit node "94.228.86.11"
[2016-03-26 12:17] I believe I have the following addrs: [IP4 "149.18.82.104"]
[2016-03-26 12:17] Trying to connect to 62.143.98.61
[2016-03-26 12:17] Just built connection with them.
[2016-03-26 12:17] Created new link to 62.143.98.61 ("Unnamed")
[2016-03-26 12:17] Built initial link to "62.143.98.61" with circuit ID 3743545541
[2016-03-26 12:17] Created circuit 3743545541
[2016-03-26 12:17] Extending circuit to "212.47.254.99"
[2016-03-26 12:17] Extending circuit to "166.70.207.2"
[2016-03-26 12:17] Extending circuit to "212.227.20.116"
[2016-03-26 12:17] Extending circuit to "176.9.148.176"
[2016-03-26 12:17] Extending circuit to exit node "198.50.200.139"
[2016-03-26 12:17] Failed to create connection to myself. No relay/exit support.

I have bolded the lines of output I am unsure about-- are these errors, or are some directories expected to not be fetched correctly? Do I have to configure a port on my laptop to act as an exit node? Thanks in advance!

security considerations

I'm just throwing this out here for discussion since this seems to be one of the first real-world implementations of crucial security-related applications in haskell.

I've asked on haskell-cafe ML before: how much do we actually know about haskell to say whether it's safe to use for such a use case? What problems (e.g. explicit memory management) do we face here and need to be concerned about? Can haskell introduce additional attack vectors (e.g. like timing problems in the JVM)?

Also a few posts about this or similar topics:
https://mail.haskell.org/pipermail/haskell-cafe/2015-February/118059.html
http://www.leonmergen.com/haskell/crypto/2015/03/21/on-the-state-of-cryptography-in-haskell.html

Hidden Services (Discovery)

While some of the message stubs are in the current tree, the current code base does not support finding hidden services. We should add this.

Based on a quick skim several months ago, this seems like it might be just a quick additional handshaking protocol, so I'm marking it small. Feel free to re-mark it as larger if it looks like a bigger task, although if you do so, it'd be great if you could leave notes explaining why you think it's bigger.

Flow Control (Link Throttling)

In Section 7.1 of spec/tor-spec.txt, they describe a way to perform some flow control at the link level. It's not well described, but there is some reference to link priorities as well as a proposal (111).

We should check for any updates that have happened on this proposal, and then implement any link throttling that makes sense.

Incremental download / decompress / parse chain

This bug may require changes to the pure-zlib library, hence the "medium."

Right now, I believe there's still a memory spike every time the node downloads a consensus or directory document from a server, because it downloads the whole file, decompresses it, and then parses it. It would be very handy if we could instead pull down a chunk, and feed it through all three bits at the same time.
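
To make the idea concrete, here is a minimal sketch of the parsing end of such a chain: a line-oriented consumer that accepts chunks one at a time and only keeps the unfinished last line between chunks. It assumes the chunks have already been decompressed, uses String rather than ByteString for brevity, and does not touch the pure-zlib API at all. LineState, feedChunk, and finish are hypothetical names.

```haskell
import Data.List (foldl')

-- State carried between chunks: the unfinished last line and a running result.
data LineState a = LineState
  { pending :: String  -- partial line left over from the previous chunk
  , result  :: a       -- parse result accumulated so far
  }

-- | Split off every complete line; return them plus the trailing partial line.
splitLines :: String -> ([String], String)
splitLines s = case break (== '\n') s of
  (line, '\n' : more) -> let (ls, r) = splitLines more in (line : ls, r)
  (partial, _)        -> ([], partial)

-- | Feed one chunk through the per-line step function.
feedChunk :: (a -> String -> a) -> LineState a -> String -> LineState a
feedChunk step (LineState rest acc) chunk =
  let (complete, rest') = splitLines (rest ++ chunk)
  in LineState rest' (foldl' step acc complete)

-- | Flush a final line that was not newline-terminated.
finish :: (a -> String -> a) -> LineState a -> a
finish step (LineState rest acc)
  | null rest = acc
  | otherwise = step acc rest

-- Example step: count lines that begin with "r ".
countRouterLines :: Int -> String -> Int
countRouterLines n line = if take 2 line == "r " then n + 1 else n
```

With this shape, memory use is bounded by the chunk size plus one partial line, provided the download and decompression stages also hand over data chunk by chunk.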

HTTP client

Is there a way to integrate into existing HTTP client libraries? Socket usage for HTTP requests is a bit inconvenient :-).

Clean Up TorAddress

Of all the shortcuts I took in the source code, perhaps the one that fills me with the most regret is TorAddress. I mean, seriously. Strings?

It's not clear to me if the data structure even makes any sense except that those are possible answers to a DNS query. It would likely be extremely valuable to, first off, create useful types for IP4 and IP6 addresses, and then clean up all the crap around addresses in the source code.

I'm marking this as medium mostly because it's pervasive, not because it seems super hard.
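
As a starting point, a typed IPv4 address could look roughly like the sketch below. This is only an illustration of the direction, not a proposal for the final TorAddress design; IPv4, parseIPv4, and the helper names are hypothetical.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (isDigit)
import Data.List (intercalate)
import Data.Word (Word32)

-- | An IPv4 address stored as a single 32-bit word, most significant octet first.
newtype IPv4 = IPv4 Word32
  deriving (Eq, Ord)

-- Render in the usual dotted-quad form.
instance Show IPv4 where
  show (IPv4 w) =
    intercalate "." [ show ((w `shiftR` s) .&. 0xff) | s <- [24, 16, 8, 0] ]

-- | Parse a dotted quad; anything malformed yields Nothing.
parseIPv4 :: String -> Maybe IPv4
parseIPv4 s = case splitOn '.' s of
  [a, b, c, d] -> do
    octets <- mapM parseOctet [a, b, c, d]
    return (IPv4 (foldl (\acc o -> (acc `shiftL` 8) .|. o) 0 octets))
  _ -> Nothing

-- | One to three decimal digits with a value of at most 255.
parseOctet :: String -> Maybe Word32
parseOctet str
  | null str || length str > 3 || not (all isDigit str) = Nothing
  | n > 255   = Nothing
  | otherwise = Just (fromIntegral n)
  where n = read str :: Int  -- only forced after the digit check above

splitOn :: Char -> String -> [String]
splitOn sep str = case break (== sep) str of
  (piece, [])       -> [piece]
  (piece, _ : rest) -> piece : splitOn sep rest
```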

No instance for Network.TLS.Backend.HasBackend Socket

I'm attempting to build this on Linux (as a normal executable, not a HaLVM instance) and ran:

cabal sandbox init
cabal install --only-dependencies
cabal build

and I'm running into:

exe/Main.hs:143:25:
    No instance for (tls-1.3.3:Network.TLS.Backend.HasBackend Socket)
      arising from a use of ‘MkNS’
    In the expression: MkNS (hansNetworkStack ns)
    In the first argument of ‘return’, namely
      ‘(MkNS (hansNetworkStack ns), logger)’
    In a stmt of a 'do' block:
      return (MkNS (hansNetworkStack ns), logger)

It looks like both versions of initializeSystem call hansNetworkStack. It seems that, at the moment, you always need to depend on hans to build the executable, in case you want to use a tap device.

Consensus Document Flag Capture

The consensus document provided by the directory servers includes a bunch of flags that turn optional features in the Tor protocol on or off. As a first step, we should figure out a way to make sure this information is available to the rest of the system, and then create a short description somewhere (maybe in this ticket?) of what the various flags are and how important we think they are to implement.
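
One possible first step, assuming the flags in question include the consensus "params" line (a space-separated list of Keyword=Integer pairs, as I understand dir-spec.txt), is to capture that line as a lookup table. The sketch below is not tied to the existing parser; parseParamsLine and parseKV are hypothetical names.

```haskell
import Data.Maybe (mapMaybe)
import Text.Read (readMaybe)

-- | Parse a line such as "params Foo=1 Bar=30000" into key/value pairs.
-- Returns Nothing if the line is not a params line; entries that are not
-- of the form Keyword=Integer are skipped.
parseParamsLine :: String -> Maybe [(String, Int)]
parseParamsLine line = case words line of
  ("params" : kvs) -> Just (mapMaybe parseKV kvs)
  _                -> Nothing

parseKV :: String -> Maybe (String, Int)
parseKV kv = case break (== '=') kv of
  (key, '=' : val) | not (null key) -> (,) key <$> readMaybe val
  _                                 -> Nothing
```

The rest of the system could then query the result with lookup and fall back to a default when a parameter is absent.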

Testing Network Stack / Auto-generation of In-Memory Tor Networks

Haskell-tor explicitly defines its network stack as something that can be plugged in later. Which means that we could, just as easily, create a totally fake network. This would make testing some features a lot easier and more convenient.

The end goal would be to be able to randomly generate (via QuickCheck) a small Tor network, and then be able to define end-to-end properties regarding that network: that when you open a connection to a legit host through a Tor circuit, you get the entire data stream correctly, for example. We could then expand this end-to-end testing as we add features.
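
A minimal sketch of the in-memory idea, using only base: two connected endpoints that exchange data through shared buffers instead of sockets. The real network stack record in haskell-tor is not reproduced here, so FakeConn and its two fields are an assumed, much smaller interface.

```haskell
import Data.IORef

-- Hypothetical connection interface; the real network stack type differs.
data FakeConn = FakeConn
  { fcSend :: String -> IO ()  -- queue bytes for the peer
  , fcRecv :: IO String        -- take everything queued so far (may be "")
  }

-- | Create two connected endpoints backed by in-memory buffers.
newFakePair :: IO (FakeConn, FakeConn)
newFakePair = do
  aToB <- newIORef ""
  bToA <- newIORef ""
  let endpoint out inp = FakeConn
        { fcSend = \bytes ->
            atomicModifyIORef' out (\buf -> (buf ++ bytes, ()))
        , fcRecv =
            atomicModifyIORef' inp (\buf -> ("", buf))
        }
  return (endpoint aToB bToA, endpoint bToA aToB)
```

An end-to-end property over such a pair would state that whatever one side sends, the other side eventually receives unchanged and in order; a generated Tor network would be a graph of these pairs.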

RNG threading through credentials code needs cleanup

In particular, there's some strangeness with the RNG created in newCredentials. I don't think it's necessarily unsafe, but it would be nice for testing reasons if this whole bit of code could be deterministic, and thus easier to replicate exactly through testing.
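
The pattern this asks for is explicit generator threading: every function that needs randomness takes a generator and returns the advanced one, so a fixed seed reproduces the same output. The sketch below uses a toy, non-cryptographic generator purely to show the shape; TestGen, FakeKeys, and fakeCredentials are hypothetical and unrelated to the real newCredentials.

```haskell
import Data.Word (Word64)

-- A toy linear congruential generator. NOT suitable for real keys;
-- it exists only to demonstrate deterministic threading in tests.
newtype TestGen = TestGen Word64

nextWord :: TestGen -> (Word64, TestGen)
nextWord (TestGen s) = (s', TestGen s')
  where s' = s * 6364136223846793005 + 1442695040888963407

-- Hypothetical stand-in for generated credentials.
data FakeKeys = FakeKeys
  { signingSeed :: Word64
  , onionSeed   :: Word64
  } deriving (Eq, Show)

-- | Draw two values, passing the generator along explicitly.
fakeCredentials :: TestGen -> (FakeKeys, TestGen)
fakeCredentials g0 =
  let (a, g1) = nextWord g0
      (b, g2) = nextWord g1
  in (FakeKeys a b, g2)
```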

Build failure

Building with GHC 7.10.3:

cabal sandbox init
cabal install --dependencies-only --enable-tests
cabal configure --enable-tests
cabal build

I've just started getting:

Preprocessing executable 'haskell-tor' for haskell-tor-0.1.2...
[1 of 3] Compiling Paths_haskell_tor ( dist/build/autogen/Paths_haskell_tor.hs, dist/build/haskell-tor/haskell-tor-tmp/Paths_haskell_tor.o )
[2 of 3] Compiling Tor.Flags        ( exe/Tor/Flags.hs, dist/build/haskell-tor/haskell-tor-tmp/Tor/Flags.o )
[3 of 3] Compiling Main             ( exe/Main.hs, dist/build/haskell-tor/haskell-tor-tmp/Main.o )

exe/Main.hs:169:21:
    No instance for (tls-1.3.4:Network.TLS.Backend.HasBackend Socket)
      arising from a use of `MkNS'
    In the expression: MkNS (hansNetworkStack ns)
    In the first argument of `return', namely
      `(MkNS (hansNetworkStack ns), logger)'
    In a stmt of a 'do' block:
      return (MkNS (hansNetworkStack ns), logger)

Goal: 100% test coverage

It would be great if every line in haskell-tor was tested with some reasonable test. Tor is a fairly complex beast, and so the more we can rely on automated testing, the better I'll feel about people using haskell-tor in truly sensitive situations.

Flow Control (Stream-level Flow Control)

Section 7.4 of spec/tor-spec.txt defines a simple mechanism for flow control for each stream in a circuit. The existing implementation includes a partial solution, via tsReadWindow in src/Tor/Circuit.hs, but I don't think it's quite right.

It would be nice to either verify that what exists there is right (unlikely) or fix it so that it is.
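
For comparison with whatever tsReadWindow currently does, here is a minimal sketch of the receiving side of a stream window. The constants (a window of 500 cells, SENDME increments of 50) follow my reading of Section 7.4 and should be verified. The sketch ignores the spec's extra condition about how much data is still waiting to be flushed. DeliveryWindow, Action, and deliverCell are hypothetical names.

```haskell
-- Constants as I read them in tor-spec.txt section 7.4 (an assumption to verify).
streamWindowStart, streamWindowIncrement :: Int
streamWindowStart     = 500
streamWindowIncrement = 50

-- Number of data cells the receiver is still willing to accept on a stream.
newtype DeliveryWindow = DeliveryWindow Int
  deriving (Eq, Show)

data Action = NoAction | SendSendme
  deriving (Eq, Show)

newStreamWindow :: DeliveryWindow
newStreamWindow = DeliveryWindow streamWindowStart

-- | Account for one received data cell. Once the window has dropped by a
-- full increment, ask the caller to send a SENDME and reopen the window.
deliverCell :: DeliveryWindow -> (DeliveryWindow, Action)
deliverCell (DeliveryWindow n)
  | n' <= streamWindowStart - streamWindowIncrement =
      (DeliveryWindow (n' + streamWindowIncrement), SendSendme)
  | otherwise = (DeliveryWindow n', NoAction)
  where n' = n - 1
```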

Https support?

I wonder if there's an easier way to implement this than putting the full TLS spec in the connectToHost' function. Is that too optimistic? :)

Building with GHC 8.0

I had a quick look at building haskell-tor with ghc-8.0.1 and quickly ran into a minefield of problems.

The first problem was that the versions of cryptonite and memory specified in the cabal file require a version of base older than the one that ships with 8.0. Then, if I bump those dependencies, cabal warns (in --dry-run mode in a sandbox):

Warning: The following packages are likely to be broken by the reinstalls:
process-1.4.2.0
ghc-8.0.1
Cabal-1.24.0.0
haskeline-0.7.2.3
ghci-8.0.1
directory-1.2.6.2
hpc-0.6.0.3
ghc-boot-8.0.1
Use --force-reinstalls if you want to install anyway.

which I have never seen before and do not understand.

Even if I ignore all that and use --force-reinstalls (it's a sandbox, right?), the dependencies build OK, but then the Tor.Link module fails to compile because there are undefined fields in the ClientParams struct. I can't find that struct in the haskell-tor sources, and can't figure out where else it's coming from.

So, what are the plans for making this build with ghc-8.0?

Bandwidth Statistic Capture and Reporting

One of the things Tor nodes are supposed to do is capture statistics about their own bandwidth usage, so that they can report this information to others. See, for example, the bandwidth, read-history, and write-history fields mentioned in spec/dir-spec.txt.

It should be relatively easy to gather some of these statistics by wrapping the provided network stack object with one that measures bandwidth usage. Probably the trickiest bit will be figuring out how to do the math correctly without keeping too much history around, and figuring out how to most elegantly provide this information to the appropriate modules.
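
The wrapping idea can be sketched as follows. The real network stack object has a different shape, so Conn below is an assumed, minimal interface, and Counters and withCounting are hypothetical names. The sketch only keeps running totals; bucketing them into the intervals that read-history and write-history expect is left out.

```haskell
import Data.IORef

-- Hypothetical minimal connection interface (an assumption).
data Conn = Conn
  { connSend :: String -> IO ()
  , connRecv :: Int -> IO String  -- read up to the given number of bytes
  }

-- Running byte totals shared by every wrapped connection.
data Counters = Counters
  { bytesWritten :: IORef Int
  , bytesRead    :: IORef Int
  }

newCounters :: IO Counters
newCounters = Counters <$> newIORef 0 <*> newIORef 0

-- | Wrap a connection so that every send and receive updates the counters.
withCounting :: Counters -> Conn -> Conn
withCounting ctrs inner = Conn
  { connSend = \bytes -> do
      connSend inner bytes
      atomicModifyIORef' (bytesWritten ctrs) (\n -> (n + length bytes, ()))
  , connRecv = \maxLen -> do
      bytes <- connRecv inner maxLen
      atomicModifyIORef' (bytesRead ctrs) (\n -> (n + length bytes, ()))
      return bytes
  }
```

Because the wrapper has the same type as the thing it wraps, the rest of the code would not need to know whether statistics are being collected.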
