
mechanize's People

Contributors

jheintz, jlleitschuh, martinkersten


mechanize's Issues

SequenceCollection / RandomSequenceCollection

To simulate different users, different sequences of actions should exist. These sequences can be managed in a single collection (SequenceCollection). Sequences can be added, removed, and retrieved, and can also be drawn at random.
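A minimal sketch of such a collection, assuming nothing about the final API: the class shape, the generic parameter `S` (standing in for whatever type ends up representing a sequence), and the method names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: S stands in for the eventual sequence type.
class SequenceCollection<S> {
    private final List<S> sequences = new ArrayList<>();
    private final Random random;

    SequenceCollection(Random random) {
        this.random = random;
    }

    void add(S sequence) {
        sequences.add(sequence);
    }

    boolean remove(S sequence) {
        return sequences.remove(sequence);
    }

    S get(int index) {
        return sequences.get(index);
    }

    int size() {
        return sequences.size();
    }

    // Draws one sequence uniformly at random; the collection is not modified.
    S drawRandom() {
        if (sequences.isEmpty()) {
            throw new IllegalStateException("no sequences to draw from");
        }
        return sequences.get(random.nextInt(sequences.size()));
    }
}
```

Passing the `Random` in from outside keeps a load test reproducible when a fixed seed is used.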

User

For load testing, a User needs to be simulated. A user performs a sequence drawn from a collection; when that sequence ends, another sequence starts, until the user is stopped.

Stopping the user signals the agent performing the requests to stop running the sequence, so a way to halt a sequence is needed.
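One way the loop and the stop signal could look, as a rough sketch only. `SimulatedUser`, the `Supplier<Runnable>` used to obtain the next sequence, and the method names are assumptions made for illustration, not existing mechanize types.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

// Hypothetical sketch of a user that keeps running sequences until stopped.
class SimulatedUser implements Runnable {
    private final Supplier<Runnable> nextSequence; // e.g. a random draw from a collection
    private final AtomicBoolean stopped = new AtomicBoolean(false);
    private volatile int completed = 0;

    SimulatedUser(Supplier<Runnable> nextSequence) {
        this.nextSequence = nextSequence;
    }

    // May be called from another thread; takes effect before the next sequence starts.
    void stop() {
        stopped.set(true);
    }

    boolean isStopped() {
        return stopped.get();
    }

    int completedSequences() {
        return completed;
    }

    @Override
    public void run() {
        while (!stopped.get()) {
            nextSequence.get().run();
            completed++; // only this thread writes the counter
        }
    }
}
```

In this form the flag is only checked between sequences. Halting in the middle of a sequence would require the sequence (or the agent) to check `isStopped()` between its own actions.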

UserPool

For load testing, a collection of concurrent users needs to be set up and controlled. A user pool manages different users, all randomly drawing sequences to perform from a sequence collection. The user pool uses one agent per simulated user.

This way one can create a load testing / integration testing environment for static and partly dynamic web pages, and also for REST services once REST / JSON support is added.

HTML5: Input Type=Email not supported

I'm trying to use mechanize to sign in to an amazon page, but running into an issue. For some reason it doesn't find the email input field that is on that page, yet does find the password, and a bunch others too. Here is the code I'm using, and I'm trying against master (3ddf8f8):

String amazonUrl = "http://www.amazon.com/gp/digital/fiona/manage/ref=gno_yam_myk";
MechanizeAgent agent = new MechanizeAgent();
Page signinPage = agent.get(amazonUrl );
Form form = signinPage.forms().get(0);
form.get("email"); // This is null
form.get("ap_email"); // This is null
form.get("password"); // This is not null

Looking at the form field in a debugger, the email input element doesn't seem to be there. The signinPage.asString contains:

<input id="ap_email" name="email" value="" type="email" size="30" maxlength="128" tabindex="1" autocorrect="off" autocapitalize="off" />
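A possible direction for a fix, sketched under the assumption that the form parser dispatches on the exact `type` attribute and has no branch for `email`. The class and method names below are hypothetical. The fallback rule itself follows the HTML specification, where a missing or unrecognized `type` value is treated as `text`.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: decides whether an <input> should be handled as a plain text field.
class InputTypes {
    // Types assumed to need dedicated handling (illustrative list, not taken from mechanize).
    private static final Set<String> NON_TEXT = new HashSet<>(Arrays.asList(
            "hidden", "password", "checkbox", "radio", "file",
            "submit", "image", "reset", "button"));

    // Every other type, including email, tel, url, search and unrecognized
    // values, falls back to a plain value-carrying text field.
    static boolean fallsBackToTextField(String typeAttribute) {
        if (typeAttribute == null || typeAttribute.trim().isEmpty()) {
            return true; // a missing type defaults to text
        }
        String type = typeAttribute.trim().toLowerCase(Locale.ROOT);
        return !NON_TEXT.contains(type);
    }
}
```

With a rule like this, `type="email"` would be picked up as an ordinary text input instead of being dropped from the form.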

unit testing for expected exceptions

This test,

    @Test(expected=IllegalArgumentException.class)
    public void testExpectPost() throws Exception {
        agent.addPageRequest("POST", "http://test.com/form", newHtml("OK", ""));

        Page result = agent.get("http://test.com/form");
        assertEquals("OK", result.getTitle());

    }   

will succeed and then fail: the test method correctly gets the expected exception, but the MechanizeTestCase.afterTest() method then fails the test with:

junit.framework.AssertionFailedError: Unexecuted page request: com.gistlabs.mechanize.MechanizeMock$PageRequest@1055e55f
    at junit.framework.Assert.fail(Assert.java:50)
    at com.gistlabs.mechanize.MechanizeTestCase.afterTest(MechanizeTestCase.java:33)

It would be nice to be able to have afterTest() detect if failure has already occurred, and then do nothing.

Doesn't work under an android emulator

Picking up from hijack of #31

Running some simple code to login to amazon under an android 4.1 emulator. The code works when run as plain java locally.

For some reason, the cookies in the response object aren't what they should be. May have something to do with the inclusion of httpclient as part of the sdk, but not sure where to start looking.

public void testSignIn() throws IOException {
    String username = "";
    String password = "";
    MechanizeAgent agent = new MechanizeAgent();

    String manageKindleUrl = "http://www.amazon.com/gp/digital/fiona/manage/ref=gno_yam_myk";
    Page signinPage = agent.get(manageKindleUrl);

    debug(signinPage);

    Form form = signinPage.forms().get(0);
    form.get("email").setValue(username);
    ((Checkable) form.get("ap_signin_existing_radio")).setChecked(true);
    form.get("password").setValue(password);
    Page managePage = form.submit();

    debug(managePage);

}

private void debug(Page page) {
    System.out.println("\n\n\n");
    System.out.println("**** Page Headers ****");
    System.out.println(page.getResponse().toString());
    System.out.println("**** Page Cookies ****");
    for (Cookie cookie : page.getAgent().cookies()) {
        System.out.println(cookie.toString());
    }
    System.out.println("**** Page Body ****");
    System.out.println(page.asString());
}

Support Apache ContentBody in post api

The current MechanizeAgent.post() API only accepts Map<String, String> arguments. This prevents sending a file through this API.

(Perhaps the post(String, Parameters) method supports this now... but not clear).

Suggest changing signature to:

    public Page post(String uri, Map<String, Object> params)

where we document Object can be one of: String, String[], or ContentBody (otherwise runtime exception thrown)
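The runtime check this implies could look roughly like the sketch below. `BodyPart` is a hypothetical stand-in for Apache's ContentBody so that the example compiles without the httpmime dependency, and `PostParams.validate` is an illustrative name, not an existing method.

```java
import java.util.Map;

// Hypothetical stand-in for Apache's ContentBody, to keep the sketch dependency-free.
interface BodyPart {
}

class PostParams {
    // Rejects any value that is not a String, a String[], or a BodyPart.
    static void validate(Map<String, Object> params) {
        for (Map.Entry<String, Object> entry : params.entrySet()) {
            Object value = entry.getValue();
            boolean supported = value instanceof String
                    || value instanceof String[]
                    || value instanceof BodyPart;
            if (!supported) {
                String typeName = (value == null) ? "null" : value.getClass().getName();
                throw new IllegalArgumentException(
                        "Unsupported value type for parameter '" + entry.getKey() + "': " + typeName);
            }
        }
    }
}
```

The trade-off of a `Map<String, Object>` signature is that mistakes surface at runtime rather than at compile time, which is why the check fails fast with the offending parameter name.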

Support Batch Loading of New Images

To mimic browser behaviour (especially when load testing) it is necessary to load all images of a page. Since every browser has an image cache, only images the agent has not encountered before need to be fetched.

Something like page.images().getMissing(ImageCollection) would do. A subclass of the image collection can state whether the images should be written to a buffer, saved to a file, or just marked as fetched.

By using the same ImageCollection across multiple pages, one can mimic a browser's image loading behaviour for an entire sequence of actions.
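The cache-like behaviour can be sketched as follows. This is an assumption-laden illustration: images are identified by URL strings, `getMissing` lives on the collection rather than on `page.images()`, and `onFetch` is a hypothetical hook for subclasses.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: remembers which image URLs were already fetched.
class ImageCollection {
    private final Set<String> fetched = new HashSet<>();

    // Returns the URLs not seen before, in page order, and records them as fetched.
    List<String> getMissing(List<String> pageImageUrls) {
        List<String> missing = new ArrayList<>();
        for (String url : pageImageUrls) {
            if (fetched.add(url)) { // add() returns false for URLs already known
                missing.add(url);
                onFetch(url);
            }
        }
        return missing;
    }

    // Subclasses could buffer the bytes or save them to a file; the default only marks the URL.
    protected void onFetch(String url) {
    }
}
```

Sharing one instance across all pages of a sequence means an image that appears on several pages is reported as missing only once.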

Introduce AgentPool to Control Multiple Agents

Introduce AgentPool as a collection of agents that can be controlled as a whole.

new AgentPool(numberOfAgents)

The AgentPool should offer a listener service and interceptors:

Interceptors: before and after (agent, sequence), init(agent)
Listener: begin, end(agent, sequence)

This makes it possible to measure how long a sequence takes for a single user.

AgentPool.process(SequenceCollection, duration) runs a load test for the given amount of time.
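A rough sketch of the listener part of this idea, assuming one worker thread per agent. `AgentPoolSketch` and `PoolListener` are hypothetical names, agents are reduced to integer ids, a single `Runnable` stands in for a draw from a SequenceCollection, and the interceptors are left out.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical listener; elapsedNanos is the duration of one sequence run.
interface PoolListener {
    void begin(int agentId, String sequenceName);
    void end(int agentId, String sequenceName, long elapsedNanos);
}

class AgentPoolSketch {
    private final int numberOfAgents;
    private final List<PoolListener> listeners = new CopyOnWriteArrayList<>();

    AgentPoolSketch(int numberOfAgents) {
        this.numberOfAgents = numberOfAgents;
    }

    void addListener(PoolListener listener) {
        listeners.add(listener);
    }

    // Runs the sequence repeatedly on every agent until the duration has elapsed.
    void process(String sequenceName, Runnable sequence, long durationMillis) {
        final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(durationMillis);
        ExecutorService executor = Executors.newFixedThreadPool(numberOfAgents);
        for (int i = 0; i < numberOfAgents; i++) {
            final int agentId = i;
            executor.submit(() -> {
                while (deadline - System.nanoTime() > 0) {
                    for (PoolListener l : listeners) l.begin(agentId, sequenceName);
                    long start = System.nanoTime();
                    sequence.run();
                    long elapsed = System.nanoTime() - start;
                    for (PoolListener l : listeners) l.end(agentId, sequenceName, elapsed);
                }
            });
        }
        executor.shutdown();
        try {
            // Allow a grace period so a sequence in flight at the deadline can finish.
            executor.awaitTermination(durationMillis + 5000, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            executor.shutdownNow();
        }
    }
}
```

Listeners are called from the worker threads, so a listener that aggregates timings needs to be thread-safe.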

Sequence

To simplify load testing and integration testing, a user needs to be simulated. The user should have a behaviour that is described by a sequence of actions. The sequence will have (random) idle times applied between the different user actions to simulate user waiting time.

agent.run(sequence) - Using an agent a sequence will be played
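A sequence with random idle times might be sketched like this. The `Sequence` class shape, the `then` and `play` method names, and the use of plain `Runnable` actions are assumptions for illustration. In the proposal the agent would drive the sequence via agent.run(sequence).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: an ordered list of actions with a random pause between them.
class Sequence {
    private final List<Runnable> actions = new ArrayList<>();
    private final long minIdleMillis;
    private final long maxIdleMillis;
    private final Random random;

    Sequence(long minIdleMillis, long maxIdleMillis, Random random) {
        if (minIdleMillis < 0 || maxIdleMillis < minIdleMillis) {
            throw new IllegalArgumentException("idle range must satisfy 0 <= min <= max");
        }
        this.minIdleMillis = minIdleMillis;
        this.maxIdleMillis = maxIdleMillis;
        this.random = random;
    }

    Sequence then(Runnable action) {
        actions.add(action);
        return this;
    }

    // Plays all actions in order, idling between consecutive actions.
    void play() {
        for (int i = 0; i < actions.size(); i++) {
            actions.get(i).run();
            if (i < actions.size() - 1) {
                // Uniform idle time in [minIdleMillis, maxIdleMillis].
                long span = maxIdleMillis - minIdleMillis + 1;
                long idle = minIdleMillis + (long) (random.nextDouble() * span);
                try {
                    Thread.sleep(idle);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return; // treat interruption as a request to halt the sequence
                }
            }
        }
    }
}
```

Treating thread interruption as a halt request is one simple way to let a sequence stop during an idle period.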

Consider hidden values as suggestions, not immutable

Evolvable, versionable APIs can take advantage of default values.

Reconsider:

com.gistlabs.mechanize.form.FormTest
    // TODO JDH: confirm that we should fail in this case... see versioning API for counter example
    @Test(expected = UnsupportedOperationException.class)
    public void testSettingValueOfHiddenInputFails() {

Offer Do(url) instead of get/post

Offer agent.do(url).set(param).add(param).post/get() as an alternative to simply using agent.get(url) or agent.post(url, new Parameters().set(x,x).add(x,y)).

With the do method it is also possible to expose a GET query builder that easily composes GET query parameters and that can also parse the parameters already present in the URL.
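The query builder part could look roughly like the sketch below. All names are hypothetical, and since `do` is a reserved word in Java the real entry point would need a different name. This sketch assumes `set` replaces all existing values of a parameter while `add` appends another value. URL fragments are not handled.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a GET query builder that keeps parameters already in the URL.
class GetQueryBuilder {
    private final String base;
    private final List<String[]> params = new ArrayList<>(); // {name, value} pairs in order

    GetQueryBuilder(String url) {
        int q = url.indexOf('?');
        base = (q < 0) ? url : url.substring(0, q);
        if (q >= 0) {
            for (String pair : url.substring(q + 1).split("&")) {
                if (pair.isEmpty()) continue;
                int eq = pair.indexOf('=');
                String name = (eq < 0) ? pair : pair.substring(0, eq);
                String value = (eq < 0) ? "" : pair.substring(eq + 1);
                params.add(new String[] {decode(name), decode(value)});
            }
        }
    }

    // Replaces every existing value of the parameter.
    GetQueryBuilder set(String name, String value) {
        params.removeIf(p -> p[0].equals(name));
        params.add(new String[] {name, value});
        return this;
    }

    // Appends a value, keeping existing ones.
    GetQueryBuilder add(String name, String value) {
        params.add(new String[] {name, value});
        return this;
    }

    String build() {
        if (params.isEmpty()) return base;
        StringBuilder sb = new StringBuilder(base).append('?');
        for (int i = 0; i < params.size(); i++) {
            if (i > 0) sb.append('&');
            sb.append(encode(params.get(i)[0])).append('=').append(encode(params.get(i)[1]));
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A call such as `new GetQueryBuilder(url).set("page", "2").add("tag", "x").build()` would then yield the URL to pass to agent.get().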

Redirect doesn't happen on a form sign in

From the code in #34: after the form is submitted, the resulting page is a redirect (with an empty body, which is what causes the trace in #34). Shouldn't submit return a Page which is the result of following the redirect?

JSoup version needs updating

I get a trace when running with jsoup 1.6.4 which goes away with 1.7.1 (from the code posted in #34)

java.lang.StringIndexOutOfBoundsException: String index out of range: 0
at java.lang.String.charAt(String.java:686)
at org.jsoup.helper.DataUtil.parseByteData(DataUtil.java:98)
at org.jsoup.helper.DataUtil.load(DataUtil.java:54)
at org.jsoup.Jsoup.parse(Jsoup.java:118)
at com.gistlabs.mechanize.html.HtmlPage.loadPage(HtmlPage.java:44)
at com.gistlabs.mechanize.Page.<init>(Page.java:60)
at com.gistlabs.mechanize.html.HtmlPage.<init>(HtmlPage.java:39)
at com.gistlabs.mechanize.html.HtmlPageFactory.buildPage(HtmlPageFactory.java:28)
at com.gistlabs.mechanize.MechanizeAgent.toPage(MechanizeAgent.java:151)
at com.gistlabs.mechanize.MechanizeAgent.request(MechanizeAgent.java:90)
at com.gistlabs.mechanize.RequestBuilder.post(RequestBuilder.java:122)
at com.gistlabs.mechanize.form.Form.submit(Form.java:296)
at com.gistlabs.mechanize.form.Form.submit(Form.java:276)
at com.gistlabs.mechanize.integration.test.AmazonSignInIT.testSignIn(AmazonSignInIT.java:52)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)

Ensure copyright notice on source code

Put a comment in with this copyright heading. We'll add an open source license for rights before making public.

// Copyright (C) 2012 Gist Labs, LLC. All Rights Reserved.

Track progress of css-selectors pull request

The generalized glue code between css-selectors and JSON Elements is something that I'd like to see maintained in the css-selectors project.

If/when this pull request is accepted we should remove our own com.gistlabs.mechanize.json.query package entirely and use the css-selectors code.

See chrsan/css-selectors#9

Reworking Page: Resource, ImageResource, and a Choice of Page or Document

A single Page type is too limiting for a full RESTful multi-content client. The name Page is common and easily understood, but perhaps too ambiguous.

Suggested hierarchy of types:
- Resource // the root of all returned types from the web, has bytestream
  |
  |- ImageResource // also has getImage()
  |
  |- Document // any tree like document type, exposes CSS selectors and common data structure
     |
     |- HtmlDocument // a wrapper around JSoup
     |
     |- JsonDocument
     |
     |- XmlDocument
     |
     .......

@MartinKersten, I assume this is what you were suggesting. Please comment if this isn't what you were thinking about. I've been convinced :)

Specify the Accepted Media Types

For some requests an Accept header must be set to accept JSON or XML rather than HTML. This can be done using the do(url) method.

For instance: agent.do(url).accept("application/xml"). The default is to accept HTML.

Parameters vs Body Content

We need to offer a non-multipart send (for POST/PUT).

We currently have these post() signatures:

    public Page post(String uri, Map<String, String> params) throws ... {
    public Page post(String uri, Parameters params) {

I suggest adding the following as well:

    public Page post(String uri, Parameters params, byte[] body) {
    public Page post(String uri, Parameters params, InputStream body) {
    public Page post(String uri, Parameters params, File body) {

I anticipate the complexity will be in handling parameters (encoding and content type issues).

See for example this code, particularly the reset multipart data calls:
https://github.com/sonatype/async-http-client/blob/master/api/src/main/java/com/ning/http/client/RequestBuilderBase.java#L479
