furrycoders / faapi Goto Github PK

View Code? Open in Web Editor NEW

16.0 3.0 6.0 1.43 MB

Python library to implement API-like functionality for the FurAffinity.net website.

License: European Union Public License 1.2

Python 100.00%

furaffinity python3 scraper api

faapi's Introduction

Fur Affinity API

Python library to implement API-like functionality for the Fur Affinity website.

Requirements

Python 3.9+ is necessary to run this library. Poetry is used for packaging and dependency management.

Usage

The API comprises a main class FAAPI, two submission classes Submission and SubmissionPartial, a journal class Journal, and a user class User.

Once FAAPI is initialized, its methods can be used to crawl FA and return parsed objects.

from requests.cookies import RequestsCookieJar
import faapi
import orjson

cookies = RequestsCookieJar()
cookies.set("a", "38565475-3421-3f21-7f63-3d341339737")
cookies.set("b", "356f5962-5a60-0922-1c11-65003b70308")

api = faapi.FAAPI(cookies)
sub, sub_file = api.submission(12345678, get_file=True)

print(sub.id, sub.title, sub.author, f"{len(sub_file) / 1024:02f}KiB")

with open(f"{sub.id}.json", "wb") as f:
    f.write(orjson.dumps(dict(sub)))

with open(sub.file_url.split("/")[-1], "wb") as f:
    f.write(sub_file)

gallery, _ = api.gallery("user_name", 1)
with open("user_name-gallery.json", "wb") as f:
    f.write(orjson.dumps(list(map(dict, gallery))))

robots.txt

At init, the FAAPI object downloads the robots.txt file from FA to determine the Crawl-delay and disallow values set therein. If not set in the robots.txt file, a crawl delay value of 1 second is used.

To respect this value, the default behaviour of the FAAPI object is to wait when a get request is made if the last request was performed more recently then the crawl delay value.

See under FAAPI for more details on this behaviour.

Furthermore, any get operation that points to a disallowed path from robots.txt will raise an exception. This check should not be circumvented, and the developer of this library does not take responsibility for violations of the TOS of Fur Affinity.

Cookies

To access protected pages, cookies from an active session are needed. These cookies can be given to the FAAPI object as a list of dictionaries - each containing a name and a value field -, or as a http.cookiejar.CookieJar object (requests.cookies.RequestsCookieJar and other objects inheriting from CookieJar are also supported). The cookies list should look like the following example:

cookies = [
    {"name": "a", "value": "38565475-3421-3f21-7f63-3d3413397537"},
    {"name": "b", "value": "356f5962-5a60-0922-1c11-65003b703038"},
]

from requests.cookies import RequestsCookieJar

cookies = RequestsCookieJar()
cookies.set("a", "38565475-3421-3f21-7f63-3d3413397537")
cookies.set("b", "356f5962-5a60-0922-1c11-65003b703038")

To access session cookies, consult the manual of the browser used to log in.

Note: it is important to not logout of the session the cookies belong to, otherwise they will no longer work.
Note: as of April 2022 only cookies a and b are needed.

User Agent

FAAPI attaches a User-Agent header to every request. The user agent string is generated at startup in the following format: faapi/{library version} Python/{python version} {system name}/{system release}.

Objects

FAAPI

This is the main object that handles all the calls to scrape pages and get submissions.

It holds 6 different fields:

session: requests.Session The session used for all requests.
robots: urllib.robotparser.RobotFileParser robots.txt handler
user_agent: str user agent used by the session (property, cannot be set)
crawl_delay: float crawl delay from robots.txt (property, cannot be set)
last_get: float time of last get (UNIX time)
raise_for_unauthorized: bool = True if set to True, raises an exception if a request is made and the resulting page is not from a login session
timeout: int | None = None requests timeout in seconds for both page requests (e.g. submissions) and files

Init

__init__(cookies: list[dict[str, str]] | CookieJar, session_class: Type[Session] = Session)

A FAAPI object must be initialised with a cookies object in the format mentioned above in #Cookies.

An optional session_class argument can be given to modify the class used by FAAPI.session. Any class based on requests.Session is accepted.

Methods & Properties

load_cookies(cookies: list[dict[str, str]] | CookieJar)
Load new cookies and create a new session.
Note: This method removes any cookies currently in use, to update/add single cookies access them from the session object.
handle_delay()
Handles the crawl delay as set in the robots.txt
check_path(path: str, *, raise_for_disallowed: bool = False) -> bool
Checks whether a given path is allowed by the robots.txt. If raise_for_disallowed is set to True a DisallowedPath exception is raised on non-allowed paths.
connection_status -> bool
Returns the status of the connection.
login_status -> bool
Returns the login status.
get(path: str, **params) -> requests.Response
This returns a response object containing the result of the get operation on the given URL with the optional **params added to it (url provided is considered as path from 'https://www.furaffinity.net/').
get_parsed(path: str, *, skip_page_check: bool = False, skip_auth_check: bool = False, **params) -> bs4.BeautifulSoup
Similar to get() but returns the parsed HTML from the normal get operation. If the GET request encountered an error, an HTTPError exception is raised. If skip_page_check is set to True, the parsed page is not checked for errors ( e.g. non-existing submission). If skip_auth_check is set to True, the page is not checked for login status.
me() -> User | None
Returns the logged-in user as a User object if the cookies are from a login session.
frontpage() -> list[SubmissionPartial]
Fetch the latest submissions from Fur Affinity's front page.
submission(submission_id: int, get_file: bool = False, *, chunk_size: int = None) -> tuple[Submission, bytes | None]
Given a submission ID, it returns a Submission object containing the various metadata of the submission itself and a bytes object with the submission file if get_file is passed as True. The optional chunk_size argument is used for the request; if left to None or set to 0 the download is performed directly without streaming.
Note: the author UserPartial object of the submission does not contain the join_date field as it does not appear on submission pages.
submission_file(submission: Submission, *, chunk_size: int = None) -> bytes
Given a submission object, it downloads its file and returns it as a bytes object. The optional chunk_size argument is used for the request; if left to None or set to 0 the download is performed directly without streaming.
journal(journal_id: int) -> Journal
Given a journal ID, it returns a Journal object containing the various metadata of the journal.
user(user: str) -> User
Given a username, it returns a User object containing information regarding the user.
gallery(user: str, page: int = 1) -> tuple[list[SubmissionPartial], int | None]
Returns the list of submissions found on a specific gallery page, and the number of the next page. The returned page number is set to None if it is the last page.
scraps(user: str, page: int = 1) -> -> tuple[list[SubmissionPartial], int | None]
Returns the list of submissions found on a specific scraps page, and the number of the next page. The returned page number is set to None if it is the last page.
favorites(user: str, page: str = "") -> tuple[list[SubmissionPartial], str | None]
Downloads a user's favorites page. Because of how favorites pages work on FA, the page argument (and the one returned) are strings. If the favorites page is the last then a None is returned as next page. An empty page value as argument is equivalent to page 1.
Note: favorites page "numbers" do not follow any scheme and are only generated server-side.
journals(user: str, page: int = 1) -> -> tuple[list[JournalPartial], int | None]
Returns the list of submissions found on a specific journals page, and the number of the next page. The returned page number is set to None if it is the last page.
watchlist_to(self, user: str, page:int = 1) -> tuple[list[UserPartial], int | None]
Given a username, returns a list of UserPartial objects for each user that is watching the given user and the next page, if it is not the last, in which case a None is returned.
watchlist_by(self, user: str, page:int = 1) -> tuple[list[UserPartial], int | None]
Given a username, returns a list of UserPartial objects for each user that is watched by the given user and the next page, if it is not the last, in which case a None is returned.

Note: The last page returned by the watchlist_to and watchlist_by may not be correct as Fur Affinity doesn't seem to have a consistent behaviour when rendering the next page button, as such it is safer to use an external algorithm to check whether the method is advancing the page but returning the same/no users.

UserPartial

A stripped-down class that holds basic user information. It is used to hold metadata gathered when parsing a submission, journal, gallery, scraps, etc.

name: str display name with capital letters and extra characters such as "_"
status: str user status (~, !, etc.)
title: str the user title as it appears on their userpage
join_date: datetime the date the user joined (defaults to timestamp 0)
avatar_url: str the URL to the user icon (used only when available)
user_tag: bs4.element.Tag the user element used to parse information (placeholder, UserPartial is filled externally)

UserPartial objects can be directly cast to a dict object and iterated through.

Comparison with UserPartial can be made with either another UserPartial or User object (the URL names are compared), or a string (the URL name is compared to the given string).

Init

__init__(user_tag: bs4.element.Tag = None)

To initialise the object, an optional bs4.element.Tag object is needed containing the user element from a user page or user folder.

If no user_tag is passed then the object fields will remain at their default - empty - value.

Methods

name_url -> str
Property method that returns the URL-safe username
url -> str
Property method that returns the Fur Affinity URL to the user (https://www.furaffinity.net/user/{name_url}).
generate_avatar_url() -> str
Generates the URl for the current user icon.
parse(user_page: bs4.BeautifulSoup = None)
Parses the stored user page for metadata. If user_page is passed, it overwrites the existing user_page value.

User

The main class storing all of a user's metadata.

name: str display name with capital letters and extra characters such as "_"
status: str user status (~, !, etc.)
title: str the user title as it appears on their userpage
join_date: datetime the date the user joined (defaults to timestamp 0)
profile: str profile text in HTML format
profile_bbcode: str profile text in BBCode format
stats: UserStats user statistics sorted in a namedtuple (views, submissions, favorites, comments_earned , comments_made, journals, watched_by, watching)
info: dict[str, str] profile information (e.g. "Accepting Trades", "Accepting Commissions", "Character Species", etc.)
contacts: dict[str, str] contact links (e.g. Twitter, Steam, etc.)
avatar_url: str the URL to the user icon
banner_url: str | None the URL to the user banner (if any is set, otherwise None)
watched: bool True if the user is watched, False otherwise
watched_toggle_link: str | None The link to toggle the watch status (/watch/ or /unwatch/ type link)
blocked: bool True if the user is blocked, False otherwise
blocked_toggle_link: str | None The link to toggle the block status (/block/ or /unblock/ type link)
user_page: bs4.BeautifulSoup the user page used to parse the object fields

User objects can be directly cast to a dict object and iterated through.

Comparison with User can be made with either another User or UserPartial object (the URL names are compared), or a string (the URL name is compared to the given string).

Init

__init__(user_page: bs4.BeautifulSoup = None)

To initialise the object, an optional bs4.BeautifulSoup object is needed containing the parsed HTML of a submission page.

If no user_page is passed then the object fields will remain at their default - empty - value.

Methods

name_url -> str
Property method that returns the URL-safe username
url -> str
Property method that returns the Fur Affinity URL to the user (https://www.furaffinity.net/user/{name_url}).
generate_avatar_url() -> str
Generates the URl for the current user icon.
parse(user_page: bs4.BeautifulSoup = None)
Parses the stored user page for metadata. If user_page is passed, it overwrites the existing user_page value.

JournalPartial

This object contains partial information gathered when parsing a journals folder. It contains the following fields:

id: int journal ID
title: str journal title
date: datetime upload date as a datetime object (defaults to timestamp 0)
author: UserPartial journal author (filled only if the journal is parsed from a bs4.BeautifulSoup page)
stats: JournalStats journal statistics stored in a named tuple (comments (count))
content: str journal content in HTML format
content_bbcode: str journal content in BBCode format
mentions: list[str] the users mentioned in the content (if they were mentioned as links, e.g. :iconusername:, @username, etc.)
journal_tag: bs4.element.Tag the journal tag used to parse the object fields

JournalPartial objects can be directly cast to a dict object or iterated through.

Comparison with JournalPartial can be made with either another JournalPartial or Journal object (the IDs are compared), or an integer (the JournalPartial.id value is compared to the given integer).

Init

__init__(journal_tag: bs4.element.Tag = None)

Journal takes one optional parameters: a journal section tag from a journals page.

If no journal_tag is passed then the object fields will remain at their default - empty - value.

Methods

url -> str
Property method that returns the Fur Affinity URL to the journal (https://www.furaffinity.net/journal/{id}).
parse(journal_item: bs4.element.Tag = None)
Parses the stored journal tag for information. If journal_tag is passed, it overwrites the existing journal_tag value.

Journal

This object contains full information gathered when parsing a journal page. It contains the same fields as JournalPartial with the addition of comments:

id: int journal ID
title: str journal title
date: datetime upload date as a datetime object (defaults to timestamp 0)
author: UserPartial journal author (filled only if the journal is parsed from a bs4.BeautifulSoup page)
stats: JournalStats journal statistics stored in a named tuple (comments (count))
content: str journal content in HTML format
content_bbcode: str journal content in BBCode format
header: str journal header in HTML format (if present)
footer: str journal footer in HTML format (if present)
mentions: list[str] the users mentioned in the content (if they were mentioned as links, e.g. :iconusername:, @username, etc.)
comments: list[Comments] the comments to the journal, organised in a tree structure
journal_page: bs4.BeautifulSoup the journal page used to parse the object fields

Journal objects can be directly cast to a dict object or iterated through.

Comparison with Journal can be made with either another Journal or JournalPartial object (the IDs are compared), or an integer (the Journal.id value is compared to the given integer).

Init

__init__(journal_page: bs4.BeautifulSoup = None)

Journal takes one optional journal page argument.

If no journal_page is passed then the object fields will remain at their default - empty - value.

Methods

url -> str
Property method that returns the Fur Affinity URL to the journal (https://www.furaffinity.net/journal/{id}).
parse(journal_page: bs4.BeautifulSoup = None)
Parses the stored journal tag for information. If journal_tag is passed, it overwrites the existing journal_tag value.

SubmissionPartial

This lightweight submission object is used to contain the information gathered when parsing gallery, scraps, and favorites pages. It contains only the following fields:

id: int submission ID
title: str submission title
author: UserPartial submission author (only the name field is filled)
rating: str submission rating [general, mature, adult]
type: str submission type [text, image, etc...]
thumbnail_url: str the URL to the submission thumbnail
submission_figure: bs4.element.Tag the figure tag used to parse the object fields

SubmissionPartial objects can be directly cast to a dict object or iterated through.

Comparison with Submission can be made with either another SubmissionPartial or Submission object (the IDs are compared), or an integer (the Submission.id value is compared to the given integer).

Init

__init__(submission_figure: bs4.element.Tag = None)

To initialise the object, an optional bs4.element.Tag object is needed containing the parsed HTML of a submission figure tag.

If no submission_figure is passed then the object fields will remain at their default - empty - value.

Methods

url -> str
Property method that returns the Fur Affinity URL to the submission (https://www.furaffinity.net/view/{id}).
parse(submission_figure: bs4.element.Tag = None)
Parses the stored submission figure tag for information. If submission_figure is passed, it overwrites the existing submission_figure value.

Submission

The main class that parses and holds submission metadata.

id: int submission ID
title: str submission title
author: UserPartial submission author (only the name, title, and avatar_url fields are filled)
date: datetime upload date as a datetime object (defaults to timestamp 0)
tags: list[str] tags list
category: str category
species: str species
gender: str gender
rating: str rating
stats: SubmissionStats submission statistics stored in a named tuple (views, comments (count), favorites)
type: str submission type (text, image, etc...)
description: str description in HTML format
description_bbcode: str description in BBCode format
footer: str footer in HTML format
mentions: list[str] the users mentioned in the description (if they were mentioned as links, e.g. :iconusername:, @username, etc.)
folder: str the submission folder (gallery or scraps)
user_folders: list[SubmissionUserFolder] user folders stored in a list of named tuples (name, url, group ( if any))
file_url: str the URL to the submission file
thumbnail_url: str the URL to the submission thumbnail
prev: int the ID of the previous submission (if any)
next: int the ID of the next submission (if any)
favorite: bool True if the submission is a favorite, False otherwise
favorite_toggle_link: str the link to toggle the favorite status (/fav/ or /unfav/ type URL)
comments: list[Comments] the comments to the submission, organised in a tree structure
submission_page: bs4.BeautifulSoup the submission page used to parse the object fields

Submission objects can be directly cast to a dict object and iterated through.

Comparison with Submission can be made with either another Submission or SubmissionPartial object (the IDs are compared), or an integer (the Submission.id value is compared to the given integer).

Init

__init__(submission_page: bs4.BeautifulSoup = None)

To initialise the object, an optional bs4.BeautifulSoup object is needed containing the parsed HTML of a submission page.

If no submission_page is passed then the object fields will remain at their default - empty - value.

Methods

url -> str
Property method that returns the Fur Affinity URL to the submission (https://www.furaffinity.net/view/{id}).
parse(submission_page: bs4.BeautifulSoup = None)
Parses the stored submission page for metadata. If submission_page is passed, it overwrites the existing submission_page value.

Comment

This object class contains comment metadata and is used to build a tree structure with the comments and their replies.

id: int the comment ID
author: UserPartial the user who posted the comment
date: datetime the date the comment was posted
text: str the comment text in HTML format
text_bbcode: str the comment text in BBCode format
replies: list[Comment] list of replies to the comment
reply_to: Comment | int | None the parent comment, if the comment is a reply. The variable type is int only if the comment is parsed outside the parse method of a Submission or Journal (e.g. by creating a new comment with a comment tag), and when iterating over the parent object (to avoid infinite recursion errors), be it Submission , Journal or another Comment.
edited: bool True if the comment was edited, False otherwise
hidden: bool True if the comment was hidden, False otherwise (if the comment was hidden, the author and date fields will default to their empty values)
parent: Submission | Journal | None the Submission or Journal object the comments are connected to
comment_tag: bs4.element.Tag the comment tag used to parse the object fields

Comment objects can be directly cast to a dict object and iterated through.

Comparison with Comment can be made with either another comment (the IDs are compared), or an integer ( the Comment.id value is compared to the given integer).

Note: The __iter__ method of Comment objects automatically removes recursion. The parent variable is set to None and reply_to is set to the comment's ID.
Note: Because each comment contains the parent Submission or Journal object (which contains the comment itself) and the replied comment object, some iterations may cause infinite recursion errors, for example when using the copy.deepcopy function. If such iterations are needed, simply set the parent variable to None and the reply_to variable to None or the comment's ID (this can be done easily after flattening the comments list with faapi.comment.flatten_comments, the comments can then be sorted again with faapi.comment.sort_comments which will also restore the reply_to values to Comment objects).

Init

__init__(self, tag: bs4.element.Tag = None, parent: Submission | Journal = None)

To initialise the object, an optional bs4.element.Tag object is needed containing the comment tag as taken from a submission/journal page.

The optional parent argument sets the parent variable described above.

If no tag is passed then the object fields will remain at their default - empty - value.

Methods

url -> str
Property method that returns the Fur Affinity URL to the comment ( e.g. https://www.furaffinity.net/view/12345678#cid:1234567890). If the parent variable is None, the property returns an empty string.
parse(tag: bs4.element.Tag = None)
Parses the stored tag for metadata. If tag is passed, it overwrites the existing tag value.

Extra Functions

These extra functions can be used to operate on a list of comments. They only alter the order and structure, but they do not touch any of the metadata.

faapi.comment.sort_comments(comments: list[Comment]) -> list[Comment]
Sorts a list of comments into a tree structure. Replies are overwritten.
faapi.comment.flatten_comments(comments: list[Comment]) -> list[Comment]
Flattens a list of comments. Replies are not modified.

Comment Tree Graphs

Using the tree structure generated by the library, it is trivial to build a graph visualisation of the comment tree using the DOT language.

submission, _ = api.submission(12345678)
comments = faapi.comment.flatten_comments(submission.comments)
with open("comments.dot", "w") as f:
    f.write("digraph {\n")
    for comment in [c for c in comments if c.reply_to is None]:
        f.write(f"    parent -> {comment.id}\n")
    for comment in comments:
        for reply in comment.replies:
            f.write(f"    {comment.id} -> {reply.id}\n")
    f.write("}")

digraph {
    parent -> 157990848
    parent -> 157993838
    parent -> 157997294
    157990848 -> 158014077
    158014077 -> 158014816
    158014816 -> 158093180
    158093180 -> 158097024
    157993838 -> 157998464
    157993838 -> 158014126
    157997294 -> 158014135
    158014135 -> 158014470
    158014135 -> 158030074
    158014470 -> 158093185
    158030074 -> 158093199
}

The graph above was generated with quickchart.io

BBCode Conversion

Using the BBCode fields allows to convert between the raw HTMl recovered from Fur Affinity and BBCode tags that follow FA's guidelines. Conversion from HTML to BBCode covers all known tags and preserves all newlines and spacing.

BBCode text can be converted to Fur Affinity's HTMl using the faapi.parse.bbcode_to_html() function. The majority of submissions can be converted back and forth between HTML and BBCode without any information loss, however, the parser rules are still a work in progress and there are many edge cases where unusual text and formatting cause the parser to generate incorrect HTML.

Exceptions

The following are the exceptions explicitly raised by the FAAPI functions. The exceptions deriving from ParsingError are chosen depending on the content of the page. Because Fur Affinity doesn't use HTTP status codes besides 404, the page is checked against a static list of known error messages/page titles in order to determine the specific error to be used. If no match is found, then the ServerError (if the page has the "Server Error" title) or the more general NoticeMessage exceptions are used instead. The actual error message parsed from the page is used as argument for the exceptions, so that it can be analysed when caught.

DisallowedPath(Exception) The path is not allowed by the robots.txt.
Unauthorized(Exception) The user is not logged-in.
ParsingError(Exception) An error occurred while parsing the page.
- NonePage(ParsingError) The parsed page is None.
- NotFound(ParsingError) The resource could not be found (general 404 page or non-existing submission, user, or journal).
- NoTitle(ParsingError) The parsed paged is missing a title.
- DisabledAccount(ParsingError) The resource belongs to a disabled account.
- ServerError(ParsingError) The page contains a server error notice.
- NoticeMessage(ParsingError) A notice of unknown type was found in the page.

Beautiful Soup Warnings

When parsing some pages or converting HTML to BBCode, the Beautiful Soup library may give some warnings, for example MarkupResemblesLocatorWarning. These warnings are left enabled for clarity, but can be disabled manually using the warnings.filterwarnings function.

Contributing

All contributions and suggestions are welcome!

If you have suggestions for fixes or improvements, you can open an issue with your idea, see #Issues for details.

Issues

If any problem is encountered during usage of the program, an issue can be opened on GitHub.

Issues can also be used to suggest improvements and features.

When opening an issue for a problem, please copy the error message and describe the operation in progress when the error occurred.

faapi's People

Contributors

Stargazers

Watchers

Forkers

vvbv warpkaiba featherbutt solipsis-project xraydylan bensuperpc

faapi's Issues

[Bug]: User name not matching

Version

3.9.2

What happened?

I have encountered a strange bug.

A username is not properly returned by the api.

For the username "ashwolves5" a user object can be created with.

user = api.user("ashwolves5")

However, the returned user name from the user object is not matching.
user.name returns only "shwolves5", which obviously causes and error when this name string is uses to creae another user object with the api.

How to reproduce the bug?

(As described)

Relevant log output

No response

[Bug]: requests unable to handle cloudflare check

Version

falocalrepo 4.4.5

What happened?

Cloudflare has enabled new checkers which prevent

How to reproduce the bug?

attempt to run program and returns cloudflare check error

Relevant log output

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/falocalrepo/__main__.py", line 59, in main
    exit(app.main(standalone_mode=False) or 0)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/falocalrepo/console/download.py", line 165, in download_users
    api: FAAPI = open_api(db, ctx)
  File "/usr/local/lib/python3.10/dist-packages/falocalrepo/console/util.py", line 199, in open_api
    if check_login and not api.login_status:
  File "/usr/local/lib/python3.10/dist-packages/faapi/base.py", line 118, in login_status
    return parse_loggedin_user(self.get_parsed("login", skip_auth_check=True)) is not None
  File "/usr/local/lib/python3.10/dist-packages/faapi/base.py", line 146, in get_parsed
    response.raise_for_status()
  File "/usr/local/lib/python3.10/dist-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://www.furaffinity.net/login

[Bug]: Does not work with Fur Affinity from 2022-11-26

The following log is from user page parsing via FALocalRepo.

Traceback (most recent call last):
  File "/Users/matteoc/Library/Python/3.10/lib/python/site-packages/falocalrepo/__main__.py", line 59, in main
    exit(app.main(standalone_mode=False) or 0)
  File "/Users/matteoc/Library/Python/3.10/lib/python/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/Users/matteoc/Library/Python/3.10/lib/python/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/matteoc/Library/Python/3.10/lib/python/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/matteoc/Library/Python/3.10/lib/python/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/matteoc/Library/Python/3.10/lib/python/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/Users/matteoc/Library/Python/3.10/lib/python/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/Users/matteoc/Library/Python/3.10/lib/python/site-packages/falocalrepo/console/download.py", line 257, in download_update
    downloader.download_users_update(list(users), list(folders), stop, deactivated, like)
  File "/Users/matteoc/Library/Python/3.10/lib/python/site-packages/falocalrepo/downloader.py", line 757, in download_users_update
    self._download_users(users_folders, stop)
  File "/Users/matteoc/Library/Python/3.10/lib/python/site-packages/falocalrepo/downloader.py", line 683, in _download_users
    err = self.download_user_page(user, stop == 1)
  File "/Users/matteoc/Library/Python/3.10/lib/python/site-packages/falocalrepo/downloader.py", line 613, in download_user_page
    user, err = download_catch(self.api.user, username)
  File "/Users/matteoc/Library/Python/3.10/lib/python/site-packages/falocalrepo/downloader.py", line 108, in download_catch
    return func(*args, **kwargs), 0
  File "/Users/matteoc/Library/Python/3.10/lib/python/site-packages/faapi/base.py", line 211, in user
    return User(self.get_parsed(join_url("user", quote(username_url(user)))))
  File "/Users/matteoc/Library/Python/3.10/lib/python/site-packages/faapi/user.py", line 193, in __init__
    self.parse()
  File "/Users/matteoc/Library/Python/3.10/lib/python/site-packages/faapi/user.py", line 234, in parse
    parsed: dict = parse_user_page(self.user_page)
  File "/Users/matteoc/Library/Python/3.10/lib/python/site-packages/faapi/parse.py", line 617, in parse_user_page
    assert tag_status is not None, _raise_exception(ParsingError("Missing name tag"))
  File "/Users/matteoc/Library/Python/3.10/lib/python/site-packages/faapi/exceptions.py", line 56, in _raise_exception
    raise err
faapi.exceptions.ParsingError: Missing name tag

[Bug]: Nothing works due to Cloudflare check

Version

3.11.3

What happened?

Hello, today I tried to use FAAPI and discovered that it stopped working.

Any request I make upon FAAPI ends with a requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: [...]¹ exception. Looking into the response gotten from the server, I can see that the content of the response starts with <!DOCTYPE html><html lang="en-US"><head><title>Just a moment...</title>, which corresponds to a Cloudflare challenge.

I don't know whether this is because FA changed something in their hosting to require all visitors to pass the challenge,² or whether FA is singling my connection for these checks, but in either case, I do think that FAAPI should at least throw a better error so the user can see that a Cloudflare check is preventing FAAPI from working because, as of now, the error thrown is a simple 403, which can mean many things.

How to reproduce the bug?

Try to do anything that sends a request to FA.

Relevant log output

No response

[...] in excerpt signifies omisison. ↩
It seems so, because testing on my two devices, I always get the challenge when trying to access FA. ↩

Fix code scanning alert - Incomplete regular expression for hostnames

Tracking issue for:

https://github.com/FurryCoders/FAAPI/security/code-scanning/1

[Bug]: HTML URL Encoded Usernames

Version

3.9.1

What happened?

I have been using the API for a little project of mine and encountered a problem with a certain username.

In their name ["Basque`"] they use the " ` " character. It gets encoded in the URL with %60.
(Reference for URL encoding https://www.w3schools.com/tags/ref_urlencode.ASP)

The problme is I can't get this user with the API's user() method.
It doesn't throuw an error but insteads retrieves the user named "basque" without the speical character.

How can this be adressed?

How to reproduce the bug?

Call:

api.user("Basque`")

And compare stats to user front page of path "/user/Basque%60".

Relevant log output

No response

Recommend Projects

React

A declarative, efficient, and flexible JavaScript library for building user interfaces.
Vue.js

🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
Typescript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
TensorFlow

An Open Source Machine Learning Framework for Everyone
Django

The Web framework for perfectionists with deadlines.
Laravel

A PHP framework for web artisans
D3

Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

javascript

JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
web

Some thing interesting about web. New door for the world.
server

A server is a program made to process requests and deliver data to clients.
Machine learning

Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Visualization

Some thing interesting about visualization, use data art
Game

Some thing interesting about game, make everyone happy.

Recommend Org

Facebook

We are working to build community through open source technology. NB: members must have two-factor auth.
Microsoft

Open source projects and samples from Microsoft.
Google

Google ❤️ Open Source for everyone.
Alibaba

Alibaba Open Source for everyone
D3

Data-Driven Documents codes.
Tencent

China tencent open source team.

furrycoders / faapi Goto Github PK

faapi's Introduction

Fur Affinity API

Requirements

Usage

robots.txt

Cookies

User Agent

Objects

FAAPI

Init

Methods & Properties

UserPartial

Init

Methods

User

Init

Methods

JournalPartial

Init

Methods

Journal

Init

Methods

SubmissionPartial

Init

Methods

Submission

Init

Methods

Comment

Init

Methods

Extra Functions

Comment Tree Graphs

BBCode Conversion

Exceptions

Beautiful Soup Warnings

Contributing

Issues

faapi's People

Contributors

Stargazers

Watchers

Forkers

faapi's Issues

Version

What happened?

How to reproduce the bug?

Relevant log output

Version

What happened?

How to reproduce the bug?

Relevant log output

Version

What happened?

How to reproduce the bug?

Relevant log output

Footnotes

Version

What happened?

How to reproduce the bug?

Relevant log output

Recommend Projects

Recommend Topics

Recommend Org