Download Mechanize For Ruby Mac

  

API documentation for the mechanize Browser object. You can create a Browser instance by calling mechanize.Browser(); every constructor argument has a default.



Contents

  • Browser API
class mechanize.Browser(history=None, request_class=None, content_parser=None, factory_class=<class mechanize._html.Factory>, allow_xhtml=False)[source]

Browser-like class with support for history, forms and links.

BrowserStateError is raised whenever the browser is in the wrong state to complete the requested operation - e.g., when back() is called when the browser history is empty, or when follow_link() is called when the current response does not contain HTML data.

Public attributes:

request: current request (mechanize.Request)

form: currently selected form (see select_form())

Parameters:
  • history – object implementing the mechanize.History interface. Note this interface is still experimental and may change in future. This object is owned by the browser instance and must not be shared among browsers.
  • request_class – Request class to use. Defaults to mechanize.Request
  • content_parser – A function that is responsible for parsing received html/xhtml content. See the builtin mechanize._html.content_parser() function for details on the interface this function must support.
  • factory_class – HTML Factory class to use. Defaults to mechanize.Factory
add_client_certificate(url, key_file, cert_file)

Add an SSL client certificate, for HTTPS client auth.

key_file and cert_file must be filenames of the key and certificate files, in PEM format. You can use e.g. OpenSSL to convert a p12 (PKCS12) file to PEM format:

openssl pkcs12 -clcerts -nokeys -in cert.p12 -out cert.pem
openssl pkcs12 -nocerts -in cert.p12 -out key.pem

Note that client certificate password input is currently very inflexible: it seems to be console only, which is presumably the default behaviour of libopenssl. In future mechanize may support third-party libraries that (I assume) allow more options here.

back(n=1)[source]

Go back n steps in history, and return response object.

n: go back this number of steps (default 1 step)

click(*args, **kwds)[source]

See mechanize.HTMLForm.click() for documentation.

click_link(link=None, **kwds)[source]

Find a link and return a Request object for it.

Arguments are as for find_link(), except that a link may be supplied as the first argument.

cookiejar

Return the current cookiejar (mechanize.CookieJar) or None

find_link(text=None, text_regex=None, name=None, name_regex=None, url=None, url_regex=None, tag=None, predicate=None, nr=0)[source]

Find a link in current page.

Links are returned as mechanize.Link objects.

Links include anchors <a>, image maps <area>, and frames <iframe>.

All arguments must be passed by keyword, not position. Zero or more arguments may be supplied. In order to find a link, all arguments supplied must match.

If a matching link is not found, mechanize.LinkNotFoundError is raised.

Parameters:
  • text – link text between link tags: e.g. <a href="blah">this bit</a> with whitespace compressed.
  • text_regex – link text between tags (as defined above) must match the regular expression object or regular expression string passed as this argument, if supplied
  • name – as for text and text_regex, but matched against the name HTML attribute of the link tag
  • url – as for text and text_regex, but matched against the URL of the link tag (note this matches against Link.url, which is a relative or absolute URL according to how it was written in the HTML)
  • tag – element name of opening tag, e.g. "a"
  • predicate – a function taking a Link object as its single argument, returning a boolean result, indicating whether the link matches
  • nr – matches the nth link that matches all other criteria (default 0)
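
The predicate parameter can be sketched as below. This is a minimal illustration that relies only on the url attribute documented for mechanize.Link; the helper name is_pdf_link and the sample URLs are hypothetical, and the stand-in objects exist only so the snippet runs without a network connection.

```python
from types import SimpleNamespace


def is_pdf_link(link):
    """Return True when the link's URL (as written in the HTML) ends in .pdf."""
    # Link.url may be relative or absolute, exactly as it appeared in the
    # page, so only the suffix is inspected here.
    return link.url.lower().endswith(".pdf")


# Stand-ins for mechanize.Link objects, used only to exercise the predicate.
sample_links = [
    SimpleNamespace(url="/docs/index.html"),
    SimpleNamespace(url="files/Report.PDF"),
]
pdf_links = [link for link in sample_links if is_pdf_link(link)]

# With a real Browser instance `br` that has a page loaded, the same function
# would be passed by keyword (assumption: `br` is a mechanize.Browser):
#     link = br.find_link(predicate=is_pdf_link, nr=0)
```

Because the predicate is combined with the other criteria, it can be mixed with text, url or tag arguments to narrow the match further.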
follow_link(link=None, **kwds)[source]

Find a link and open() it.

Arguments are as for click_link().

Return value is same as for open().

forms()[source]

Return iterable over forms.

The returned form objects implement the mechanize.HTMLForm interface.

geturl()[source]

Get URL of current document.

global_form()[source]

Return the global form object, or None if the factory implementation did not supply one.

The "global" form object contains all controls that are not descendants of any FORM element.

The returned form object implements the mechanize.HTMLForm interface.

This is a separate method since the global form is not regarded as part of the sequence of forms in the document – mostly for backwards-compatibility.

links(**kwds)[source]

Return iterable over links (mechanize.Link objects).

open(url_or_request, data=None, timeout=<object object>)[source]

Open a URL. Loads the page so that you can subsequently use forms(), links(), etc. on it.

Parameters:
  • url_or_request – Either a URL or a mechanize.Request
  • data (dict) – data to send with a POST request
  • timeout – Timeout in seconds
Returns:

A mechanize.Response object
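
A typical call sequence built on open() might look like the sketch below. The helper name fetch_title and the 30-second default are hypothetical; the function only uses methods documented on this page (open(), viewing_html(), title()) and expects to be given a mechanize.Browser instance.

```python
def fetch_title(br, url, timeout=30.0):
    """Open `url` with a Browser-like object and return the page title, or None."""
    br.open(url, timeout=timeout)
    # title() is only meaningful for HTML responses, so check first.
    if not br.viewing_html():
        return None
    return br.title()


# Intended use (assumption: mechanize is installed and the URL is reachable):
#     import mechanize
#     br = mechanize.Browser()
#     print(fetch_title(br, "http://example.com/"))
```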

open_novisit(url_or_request, data=None, timeout=<object object>)[source]

Open a URL without visiting it.

Browser state (including request, response, history, forms and links) is left unchanged by calling this function.

The interface is the same as for open().

This is useful for things like fetching images.

See also retrieve()

reload()[source]

Reload current document, and return response object.

response()[source]

Return a copy of the current response.

The returned object has the same interface as the object returned by open()

retrieve(fullurl, filename=None, reporthook=None, data=None, timeout=<object object>, open=<built-in function open>)

Returns (filename, headers).

For remote objects, the default filename will refer to a temporary file. Temporary files are removed when the OpenerDirector.close() method is called.

For file: URLs, at present the returned filename is None. This may change in future.

If the actual number of bytes read is less than indicated by the Content-Length header, raises ContentTooShortError (a URLError subclass). The exception’s .result attribute contains the (filename, headers) that would have been returned.

select_form(name=None, predicate=None, nr=None, **attrs)[source]

Select an HTML form for input.

This is a bit like giving a form the “input focus” in a browser.

If a form is selected, the Browser object supports the HTMLForm interface, so you can call methods like set_value(), set(), and click().

Another way to select a form is to assign to the .form attribute. The form assigned should be one of the objects returned by the forms() method.

If no matching form is found, mechanize.FormNotFoundError is raised.

If name is specified, then the form must have the indicated name.

If predicate is specified, then the form must match that function. The predicate function is passed the mechanize.HTMLForm as its single argument, and should return a boolean value indicating whether the form matched.

nr, if supplied, is the sequence number of the form (where 0 is the first). Note that form 0 is the first form matching all the other arguments (if supplied); it is not necessarily the first form in the document. The "global form" (consisting of all form controls not contained in any FORM element) is considered not to be part of this sequence and to have no name, so will not be matched unless both name and nr are None.

You can also match on any HTML attribute of the <form> tag by passing in the attribute name and value as keyword arguments. To convert HTML attributes into syntactically valid Python keyword arguments, the following simple rule is used. The Python keyword argument name is converted to an HTML attribute name by replacing all underscores with hyphens and removing any trailing underscores. You can pass in strings, functions or regular expression objects as the values to match.
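
The keyword-to-attribute rule above can be sketched in a few lines. This is an illustration of the rule as described, not mechanize's own code; the function name keyword_to_attribute is hypothetical, and stripping trailing underscores before substituting hyphens is an assumption about the order of the two steps.

```python
def keyword_to_attribute(keyword):
    """Convert a Python keyword-argument name to an HTML attribute name."""
    # Trailing underscores are dropped first (so `class_` can stand in for the
    # reserved word `class`), then the remaining underscores become hyphens.
    return keyword.rstrip("_").replace("_", "-")


examples = {kw: keyword_to_attribute(kw) for kw in ("class_", "data_form_id", "id")}

# With a real Browser instance `br` that has a page loaded, such keywords
# would be passed straight to select_form (the attribute values are made up):
#     br.select_form(class_="login")
#     br.select_form(data_form_id=re.compile(r"^signup-"))
```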

set_ca_data(cafile=None, capath=None, cadata=None, context=None)

Set the SSL Context used for connecting to SSL servers.

This method accepts the same arguments as the ssl.SSLContext.load_verify_locations() method from the Python standard library. Alternatively, you can pass a pre-built ssl.SSLContext via the context keyword argument. Note that to use this feature, you must be using Python >= 2.7.9.

set_client_cert_manager(cert_manager)

Set a mechanize.HTTPClientCertMgr, or None.

set_cookie(cookie_string)[source]

Set a cookie.

Note that it is NOT necessary to call this method under ordinary circumstances: cookie handling is normally entirely automatic. The intended use case is rather to simulate the setting of a cookie by client script in a web page (e.g. JavaScript). In that case, use of this method is necessary because mechanize currently does not support JavaScript, VBScript, etc.

The cookie is added in the same way as if it had arrived with the current response, as a result of the current request. This means that, for example, if it is not appropriate to set the cookie based on the current request, no cookie will be set.

The cookie will be sent automatically with subsequent requests made by the Browser instance whenever that’s appropriate.

cookie_string should be a valid value of the Set-Cookie header.
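
One way to build a valid cookie_string is with the standard library's http.cookies module, as sketched below. The cookie name sid and its value are hypothetical, and the final call is shown only as a comment because it needs a Browser that already holds a response from the relevant site.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["sid"] = "abcdef"
cookie["sid"]["path"] = "/"

# OutputString() renders only the header value, without a "Set-Cookie:" prefix.
cookie_string = cookie["sid"].OutputString()

# With a Browser instance `br` that has an open response (assumption), the
# string would then be handed over:
#     br.set_cookie(cookie_string)
```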

Currently, this method does not allow for adding RFC 2965 cookies. This limitation will be lifted if anybody requests it.

See also set_simple_cookie() for an easier way to set cookies without needing to create a Set-Cookie header string.

set_cookiejar(cookiejar)

Set a mechanize.CookieJar, or None.

set_debug_http(handle)

Print HTTP headers to sys.stdout.

set_debug_redirects(handle)

Log information about HTTP redirects (including refreshes).

Logging is performed using the logging module. The logger name is "mechanize.http_redirects". To actually print some debug output, attach a handler to that logger and set its level.
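
A minimal sketch of that configuration, using only the standard logging module. The INFO level is an assumption; lower it to DEBUG if no output appears.

```python
import logging
import sys

# Route redirect messages to standard output.
logger = logging.getLogger("mechanize.http_redirects")
logger.addHandler(logging.StreamHandler(sys.stdout))
logger.setLevel(logging.INFO)
```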

Other logger names relevant to this module:

  • mechanize.http_responses
  • mechanize.cookies

To turn on everything, attach a handler to the parent "mechanize" logger and set its level.
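
The sketch below wraps that setup in a small helper; the name enable_mechanize_logging is hypothetical. It relies on the standard behaviour of dotted logger names: loggers such as "mechanize.cookies" that have no level of their own inherit the level of the "mechanize" parent and pass their records up to its handlers.

```python
import logging
import sys


def enable_mechanize_logging(level=logging.INFO, stream=sys.stdout):
    """Attach one handler to the parent logger so all child loggers report."""
    parent = logging.getLogger("mechanize")
    parent.addHandler(logging.StreamHandler(stream))
    parent.setLevel(level)
    return parent


parent_logger = enable_mechanize_logging()
# Child loggers pick up the parent's level through the logging hierarchy.
cookies_level = logging.getLogger("mechanize.cookies").getEffectiveLevel()
```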

set_debug_responses(handle)

Log HTTP response bodies.

See set_debug_redirects() for details of logging.

Response objects may be .seek()able if this is set (currently returned responses are, raised HTTPError exception responses are not).

set_handle_equiv(handle, head_parser_class=None)

Set whether to treat HTML http-equiv headers like HTTP headers.

Response objects may be .seek()able if this is set (currently returned responses are, raised HTTPError exception responses are not).

set_handle_gzip(handle)

Add header indicating to server that we handle gzip content encoding. Note that if the server sends gzip’ed content, it is handled automatically in any case, regardless of this setting.

set_handle_redirect(handle)

Set whether to handle HTTP 30x redirections.

set_handle_referer(handle)[source]

Set whether to add Referer header to each request.

set_handle_refresh(handle, max_time=None, honor_time=True)

Set whether to handle HTTP Refresh headers.

set_handle_robots(handle)

Set whether to observe rules from robots.txt.

set_handled_schemes(schemes)

Set sequence of URL scheme (protocol) strings.

For example: ua.set_handled_schemes(["http", "ftp"])

If this fails (with ValueError) because you’ve passed an unknown scheme, the set of handled schemes will not be changed.

set_header(header, value=None)[source]

Convenience method to set a header value in self.addheaders so that the header is sent out with all requests automatically.

Parameters:
  • header – The header name, e.g. User-Agent
  • value – The header value. If set to None the header is removed.
set_html(html, url='http://example.com/')[source]

Set the current response to a dummy response containing the given HTML, using the given URL if supplied.

Allows you to then parse that HTML, especially to extract forms information. If no URL was given then the default is "http://example.com/".

set_password_manager(password_manager)

Set a mechanize.HTTPPasswordMgrWithDefaultRealm, or None.

set_proxies(proxies=None, proxy_bypass=None)

Configure proxy settings.

Parameters:
  • proxies – dictionary mapping URL scheme to proxy specification. None means use the default system-specific settings.
  • proxy_bypass – function taking hostname, returning whether proxy should be used. None means use the default system-specific settings.

The default is to try to obtain proxy settings from the system (see the documentation for urllib.urlopen for information about the system-specific methods used – note that’s urllib, not urllib2).

To avoid all use of proxies, pass an empty proxies dict.

set_proxy_password_manager(password_manager)

Set a mechanize.HTTPProxyPasswordMgr, or None.

set_request_gzip(handle)

Add header indicating to server that we handle gzip content encoding. Note that if the server sends gzip’ed content, it is handled automatically in any case, regardless of this setting.

set_response(response)[source]

Replace current response with (a copy of) response.

response may be None.

This is intended mostly for HTML-preprocessing.

set_simple_cookie(name, value, domain, path='/')[source]

Similar to set_cookie() except that instead of using a cookie string, you simply specify the name, value, domain and optionally the path. The created cookie will never expire.
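
As a rough sketch, several cookies can be set in one go by looping over a dictionary. The helper name set_session_cookies and the sample values are hypothetical; the call itself follows the set_simple_cookie(name, value, domain, path='/') signature shown above and expects a mechanize.Browser instance.

```python
def set_session_cookies(br, cookies, domain, path="/"):
    """Set each name/value pair in `cookies` on a Browser-like object."""
    for name, value in cookies.items():
        # Argument order follows the documented signature:
        # set_simple_cookie(name, value, domain, path='/')
        br.set_simple_cookie(name, value, domain, path=path)


# Intended use (assumption: `br` is a mechanize.Browser):
#     set_session_cookies(br, {"lang": "en", "theme": "dark"}, ".example.com")
```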

submit(*args, **kwds)[source]

Submit current form.

Arguments are as for mechanize.HTMLForm.click().


Return value is same as for open().


title()[source]

Return title, or None if there is no title element in the document.

viewing_html()[source]

Return whether the current response contains HTML data.

visit_response(response, request=None)[source]

Visit the response, as if it had been open() ed.

Unlike set_response(), this updates history rather than replacing the current response.

class mechanize.Request(url, data=None, headers={}, origin_req_host=None, unverifiable=False, visit=None, timeout=<object object>, method=None)[source]

A request for some network resource. Note that if you specify the method as 'GET' and the data as a dict, then it will be automatically appended to the URL. If you leave method as None, then the method will be auto-set to POST and the data will become part of the POST request.

Parameters:
  • url (str) – The URL to request
  • data – Data to send with this request. Can be either a dictionary which will be encoded and sent as application/x-www-form-urlencoded data or a bytestring which will be sent as is. If you use a bytestring you should also set the Content-Type header appropriately.
  • headers (dict) – Headers to send with this request
  • method (str) – Method to use for HTTP requests. If not specified mechanize will choose GET or POST automatically as appropriate.
  • timeout (float) – Timeout in seconds

The remaining arguments are for internal use.
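
To make the GET versus POST distinction concrete, the sketch below uses the standard library's urlencode to show where dictionary data ends up in each case. It illustrates the application/x-www-form-urlencoded encoding in general; the URL and parameters are hypothetical, and mechanize's own output may differ in detail.

```python
from urllib.parse import urlencode

base_url = "http://example.com/search"
params = {"q": "mechanize", "page": "2"}

# method='GET' with dict data: the encoded pairs become the query string.
get_url = base_url + "?" + urlencode(params)

# method=None with dict data: the same encoding is sent as the POST body.
post_body = urlencode(params).encode("ascii")

# The corresponding mechanize calls would be (assumption: mechanize installed):
#     mechanize.Request(base_url, data=params, method="GET")
#     mechanize.Request(base_url, data=params)
```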

add_data(data)

Set the data (a bytestring) to be sent with this request

add_header(key, val=None)[source]

Add the specified header, replacing existing one, if needed. If val is None, remove the header.

add_unredirected_header(key, val)[source]

Same as add_header() except that this header will not be sent for redirected requests.

get_data()[source]

The data to be sent with this request

get_header(header_name, default=None)[source]

Get the value of the specified header. If absent, return default

get_method()[source]

The method used for HTTP requests

has_data()[source]

True iff there is some data to be sent with this request

has_header(header_name)[source]

Check if the specified header is present

has_proxy()[source]

Private method.

header_items()[source]

Get a copy of all headers for this request as a list of 2-tuples


set_data(data)[source]

Set the data (a bytestring) to be sent with this request

Response objects in mechanize are .seek()able file-like objects that support some additional methods, depending on the protocol used for the connection. The documentation below is for HTTP(S) responses, as these are the most common.

Additional methods present for HTTP responses:

class mechanize._mechanize.HTTPResponse
code

The HTTP status code

getcode()

Return HTTP status code

geturl()

Return the URL of the resource retrieved, commonly used to determine if a redirect was followed

get_all_header_names(normalize=True)

Return a list of all header names. When normalize is True, the case of the header names is normalized.

get_all_header_values(name, normalize=True)

Return a list of all values for the specified header name (which is case-insensitive). Since headers in HTTP can be specified multiple times, the returned value is always a list. See rfc822.Message.getheaders().

info()

Return the headers of the response as a rfc822.Message instance.

__getitem__(header_name)

Return the last HTTP header matching the specified name as a string. mechanize Response objects act like dictionaries for convenient access to header values. For example: response['Date']. You can access header values using the header names, case-insensitively. Note that when more than one header with the same name is present, only the value of the last header is returned; use get_all_header_values() to get the values of all headers.

get(header_name, default=None)


Return the header value for the specified header_name or default if the header is not present. See __getitem__().
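
The response methods above can be combined as in the following sketch. The helper name summarize_response and the "unknown" fallback are hypothetical; the function calls only methods listed in this section and expects the object returned by Browser.open().

```python
def summarize_response(response):
    """Collect status, final URL and selected headers from an HTTP response."""
    return {
        "status": response.getcode(),
        # geturl() reflects any redirect that was followed.
        "final_url": response.geturl(),
        # get() returns the last matching header, or the default if absent.
        "content_type": response.get("Content-Type", "unknown"),
        # A header may repeat, so every Set-Cookie value is kept as a list.
        "cookies": response.get_all_header_values("Set-Cookie"),
    }


# Intended use (assumption: `br` is a mechanize.Browser):
#     summary = summarize_response(br.open("http://example.com/"))
```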

class mechanize.Link(base_url, url, text, tag, attrs)[source]

A link in a HTML document

Variables:
  • absolute_url – The absolutized link URL
  • url – The link URL
  • base_url – The base URL against which this link is resolved
  • text – The link text
  • tag – The link tag name
  • attrs – The tag attributes
class mechanize.History[source]

Though this will become public, the implied interface is not yet stable.

mechanize._html.content_parser(data, url=None, response_info=None, transport_encoding=None, default_encoding='utf-8', is_html=True)[source]


Parse data (a bytes object) into an etree representation such as xml.etree.ElementTree or lxml.etree

Parameters:
  • data (bytes) – The data to parse
  • url – The URL of the document being parsed or None
  • response_info – Information about the document (contains all HTTP headers as HTTPMessage)
  • transport_encoding – The character encoding for the document being parsed as specified in the HTTP headers or None.
  • default_encoding – The character encoding to use if no encoding could be detected and no transport_encoding is specified
  • is_html – If the document is to be parsed as HTML.