estebanpdl / telegram-tracker

The package connects to Telegram's API to generate JSON files containing data for channels, including information and posts. It allows you to search for specific channels or a set of channels provided in a text file, with one channel per line.

Python 100.00%
open-source osint osint-python python python3 telegram-api

telegram-tracker's Introduction

Telegram-API: a Python-based open-source tool for Telegram


Overview

This tool connects to Telegram's API and generates JSON files containing channel data, including each channel's profile information and posts. You can search for a specific channel or for a set of channels listed in a text file (one channel per line).

By default, files are saved in a folder called output/data, which the script creates. You can also specify a different output directory for the collected data.

Software required

Python required libraries

Installing

  • Via git clone
git clone https://github.com/estebanpdl/telegram-api.git

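This creates a local copy of the repository containing the Python scripts. Note that git names the directory after the last part of the clone URL, so the command above produces telegram-api unless you pass a different directory name. Cloning makes it easy to upgrade and switch between available releases.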

  • From the GitHub download button

Download the ZIP file from GitHub and use your favorite zip utility to unpack telegram-tracker.zip in your preferred location.

After cloning or downloading the repository, install the libraries listed in requirements.txt.

pip install -r requirements.txt

or

pip3 install -r requirements.txt

Once you obtain an API ID and API hash on my.telegram.org, populate the config/config.ini file with the described values.

[Telegram API credentials]
api_id = api_id
api_hash = api_hash
phone = phone

Note: Your phone number is required to authenticate the first time you run the tool. Use the format +<code><number> (e.g., +19876543210). Telegram will send a login code through the Telegram app, which you will need to enter.
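The configuration step above can be checked before running the tool. The following is a minimal sketch, not the project's own loader: load_credentials is a hypothetical helper, and the phone check simply assumes the "+ followed by digits" format described in the note.

```python
import configparser
import re
from pathlib import Path

SECTION = "Telegram API credentials"  # section name used in config/config.ini


def load_credentials(path="config/config.ini"):
    """Hypothetical helper: read api_id, api_hash and phone from the config file."""
    config = configparser.ConfigParser()
    # ConfigParser.read() silently skips missing files and returns the list
    # of files it actually parsed, so an empty list means "file not found".
    if not config.read(path, encoding="utf-8"):
        raise FileNotFoundError(f"Config file not found: {Path(path).resolve()}")
    if SECTION not in config:
        raise KeyError(f"Section [{SECTION}] is missing from {path}")
    attrs = dict(config[SECTION])
    # Assumed phone format: "+" followed by country code and number, digits only.
    if not re.fullmatch(r"\+\d+", attrs.get("phone", "")):
        raise ValueError("phone must look like +19876543210")
    return attrs
```

Because ConfigParser.read() ignores files it cannot find, a wrong working directory tends to show up as a KeyError on the section name rather than as a missing-file error; checking the return value of read() makes that case explicit.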



Example usage

main.py

This Python script will connect to Telegram's API and handle your API request.

Options

  • --telegram-channel Specifies the Telegram channel to download data from.
  • --batch-file File containing the Telegram channels to download data from, one channel per line.
  • --limit-download-to-channel-metadata Collects channel metadata only, not the channel's messages. (default = False)
  • --output, -o Specifies a folder in which to save collected data. If not given, the script creates a default folder called ./output/data
  • --min-id Specifies the offset ID: only posts newer than this message ID are collected, which lets you update existing data with new posts.
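The options above map naturally onto Python's argparse module. The sketch below is an illustration only and is not taken from main.py: the flag names and the default output folder come from the list above, while the argument types and help texts are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented options of main.py."""
    parser = argparse.ArgumentParser(description="Collect Telegram channel data")
    parser.add_argument("--telegram-channel", type=str,
                        help="channel to download data from")
    parser.add_argument("--batch-file", type=str,
                        help="text file with one channel per line")
    # store_true gives the documented default of False.
    parser.add_argument("--limit-download-to-channel-metadata",
                        action="store_true",
                        help="collect channel metadata only")
    parser.add_argument("--output", "-o", type=str, default="./output/data",
                        help="folder to save collected data")
    # Assumed to be an integer message ID.
    parser.add_argument("--min-id", type=int,
                        help="offset ID used to fetch only newer posts")
    return parser


args = build_parser().parse_args(
    ["--telegram-channel", "channelname", "--min-id", "12345"]
)
print(args.telegram_channel, args.min_id, args.output)
# prints: channelname 12345 ./output/data
```

argparse converts dashes in long option names to underscores, so --telegram-channel is read back as args.telegram_channel.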

Structure of output data

├──🗂 output
|   └──🗂 data
|   	└──🗂 <channel_name>
|   		└──<channel_name>.json
|   		└──<channel_name>_messages.json
|   	└──chats.txt // TM channels, groups, or users' IDs found in data.
|   	└──collected_chats.csv // TM channels or groups found in data (e.g., forwards)
|   	└──collected_chats.xlsx // TM channels or groups found in data (e.g., forwards)
|   	└──counter.csv // TM channels, groups or users found in data (e.g., forwards)
|   	└──user_exceptions.txt // From collected_chats, mostly TM users whose metadata could not be retrieved via the API
|   	└──msgs_dataset.csv // Posts and messages from the requested channels
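For scripts that post-process the collected data, the layout above can be expressed as paths. This is a small sketch under the assumption that the tree is accurate; expected_files is a hypothetical helper and not part of the tool.

```python
from pathlib import Path


def expected_files(channel_name, output_dir="./output/data"):
    """Hypothetical helper: paths the tool is documented to write for a channel."""
    base = Path(output_dir)
    channel_dir = base / channel_name
    return {
        # Per-channel files
        "profile": channel_dir / f"{channel_name}.json",
        "messages": channel_dir / f"{channel_name}_messages.json",
        # Files shared by all collected channels
        "chats": base / "chats.txt",
        "collected_chats": base / "collected_chats.csv",
        "counter": base / "counter.csv",
        "user_exceptions": base / "user_exceptions.txt",
    }


for label, path in expected_files("channelname").items():
    print(f"{label}: {path} (exists: {path.exists()})")
```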

Examples


Basic request

python main.py --telegram-channel channelname

Expected output

  • Files of collected channels:
    • chats.txt
    • collected_chats.csv
    • user_exceptions.txt
    • counter.csv
  • A new folder: <channel_name> containing
    • A JSON file containing channel's profile metadata
    • A JSON file containing posts from the requested channel

Request using a text file containing a set of channels

python main.py --batch-file './path/to/channels_text_file.txt'

Expected output

  • Files of collected channels:
    • chats.txt
    • collected_chats.csv
    • user_exceptions.txt
    • counter.csv
  • One new folder per requested channel, named <channel_name>, each containing
    • A JSON file containing channel's profile metadata
    • A JSON file containing posts from the requested channel
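A batch file is just a plain-text list with one channel per line. The sketch below shows one way such a file could be read; read_channels is a hypothetical helper, and skipping blank lines and duplicates is an assumption rather than documented behavior of main.py.

```python
from pathlib import Path


def read_channels(path):
    """Hypothetical helper: return channel names from a one-per-line text file."""
    channels = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        name = line.strip()
        # Assumption: ignore empty lines and keep only the first occurrence.
        if name and name not in channels:
            channels.append(name)
    return channels
```

Calling read_channels("./path/to/channels_text_file.txt") returns the names in file order, ready to be processed one channel at a time.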

These examples retrieve all posts available through the API from the requested channels. To collect only channel information, without posts, run:


Limit download to channel's metadata only

python main.py --telegram-channel channelname --limit-download-to-channel-metadata

or, using a set of Telegram channels via a text file:

python main.py --batch-file './path/to/channels_text_file.txt' --limit-download-to-channel-metadata

Updating channel's data

To collect new messages from a channel, first identify the message ID of the last post you already collected. Then pass that ID to --min-id:

python main.py --telegram-channel channelname --min-id 12345

Expected output

  • Files of collected channels:
    • chats.txt
    • collected_chats.csv
    • user_exceptions.txt
    • counter.csv
  • A new folder: <channel_name> containing
    • A JSON file containing channel's profile metadata
    • A JSON file containing posts from the requested channel
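Finding the value for --min-id amounts to taking the largest message ID already collected. The sketch below assumes the messages have been loaded as a list of dictionaries that each carry an integer "id" field; the exact layout of <channel_name>_messages.json is not described here, so adapt the loading step to the file you have. last_message_id is a hypothetical helper.

```python
def last_message_id(messages):
    """Hypothetical helper: largest integer "id" among message dicts, or None."""
    # Assumption: each collected message is a dict with an integer "id" key.
    ids = [m["id"] for m in messages if isinstance(m.get("id"), int)]
    return max(ids, default=None)


sample = [{"id": 12343}, {"id": 12345}, {"id": 12344}]
print(last_message_id(sample))  # prints: 12345
```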

Specify output folder

The script lets you specify an output directory in which to save collected data. It creates the folders if they do not exist.

python main.py --telegram-channel channelname --output ./path/to/chosen/directory

The expected output is the same as described above, but the data will be saved in the chosen directory.



build-datasets.py

This Python script reads the collected files and creates a new dataset containing messages from the requested channels. By default, the created dataset will be located in the output folder.

If you provided a specific directory to save collected data, you need to provide the same path to use this script.

Options

  • --data-path Path where the data is located. Uses ./output/data if not given.

If a specific directory was not provided in main.py, run:

python build-datasets.py

If you provided a specific directory using the option --output in main.py, run:

python build-datasets.py --data-path ./path/to/chosen/directory

Either command creates the above-mentioned dataset, msgs_dataset.csv, a file containing posts and messages from the requested channels.
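Once msgs_dataset.csv exists, a quick sanity check is to list its columns and count its rows. The sketch below uses only the standard library and makes no assumption about column names; summarize_dataset is a hypothetical helper, and the path must point to wherever the file was written on your machine.

```python
import csv


def summarize_dataset(path):
    """Hypothetical helper: return (column names, number of data rows) of a CSV."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        rows = sum(1 for _ in reader)  # iterate once to count data rows
        columns = list(reader.fieldnames or [])
    return columns, rows
```

For very large message fields, csv.field_size_limit may need to be raised before reading.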



channels-to-network.py

This Python script builds a network graph from the collected data. By default, the output is saved in the output folder. The script also saves a preliminary graph image, network.png, using the modules matplotlib, networkx, and python-louvain (which implements community detection). You can import the GEXF graph file into other software, including Gephi.

Options

  • --data-path Path where the data is located. Uses ./output/data if not given.

If a specific directory was not provided in main.py, run:

python channels-to-network.py

If you provided a specific directory using the option --output in main.py, run:

python channels-to-network.py --data-path ./path/to/chosen/directory
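To give an idea of what a channel network contains, the sketch below aggregates (channel, forwarded-from) pairs into weighted edges. It is a standard-library illustration only: the script itself relies on networkx, and both weighted_edges and the idea that edges come from forwards are assumptions made here for the example.

```python
from collections import Counter


def weighted_edges(forwards):
    """Hypothetical helper: count how often each (source, target) pair occurs.

    forwards is an iterable of (channel, forwarded_from) pairs.
    """
    # Assumption: self-references carry no network information and are dropped.
    counts = Counter((src, dst) for src, dst in forwards if src != dst)
    # most_common() sorts by descending count, ties in first-seen order.
    return [(src, dst, weight) for (src, dst), weight in counts.most_common()]


pairs = [("a", "b"), ("a", "b"), ("b", "c"), ("c", "c")]
print(weighted_edges(pairs))
# prints: [('a', 'b', 2), ('b', 'c', 1)]
```

A list of (source, target, weight) triples like this can be loaded into most graph libraries, for example with networkx's add_weighted_edges_from.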

telegram-tracker's Issues

696

Can you help???
I have code
for a Telegram bot
Where on the site do I put it
and what should I do???

Setup.py?

Setup.py is not found, keep getting error while running pip install -r requirements.txt that says python setup.py bdist_wheel did not run successfully. Using Windows Powershell as command line, virtual env already setup. Any input would be greatly appreciated, thanks.

[WinError 10054]

Hello. I have followed the steps and I can't get it to work. It gives me this error:

Init program at Wed Sep 27 09:34:04 2023

Attempt 1 at connecting failed: ConnectionResetError: [WinError 10054] Se ha forzado la interrupción de una conexión existente por el host remoto
Attempt 2 at connecting failed: ConnectionAbortedError: [WinError 10053] Se ha anulado una conexión establecida por el software en su equipo host
Server closed the connection: [WinError 10054] Se ha forzado la interrupción de una conexión existente por el host remoto
Connection error 3 during auth_key gen: ConnectionResetError: [WinError 10054] Se ha forzado la interrupción de una conexión existente por el host remoto
Attempt 4 at connecting failed: ConnectionAbortedError: [WinError 10053] Se ha anulado una conexión establecida por el software en su equipo host
Attempt 5 at connecting failed: ConnectionAbortedError: [WinError 10053] Se ha anulado una conexión establecida por el software en su equipo host
Attempt 6 at connecting failed: ConnectionResetError: [WinError 10054] Se ha forzado la interrupción de una conexión existente por el host remoto
Traceback (most recent call last):
File "main.py", line 119, in
client = loop.run_until_complete(
File "C:\Users\barri\AppData\Local\Programs\Python\Python38\lib\asyncio\base_events.py", line 616, in run_until_complete
return future.result()
File "F:\Telegram\telegram-tracker-main\api_init_.py", line 26, in get_connection
await client.connect()
File "C:\Users\barri\AppData\Local\Programs\Python\Python38\lib\site-packages\telethon\client\telegrambaseclient.py", line 544, in connect
if not await self._sender.connect(self._connection(
File "C:\Users\barri\AppData\Local\Programs\Python\Python38\lib\site-packages\telethon\network\mtprotosender.py", line 134, in connect
await self._connect()
File "C:\Users\barri\AppData\Local\Programs\Python\Python38\lib\site-packages\telethon\network\mtprotosender.py", line 260, in _connect
raise ConnectionError('Connection to Telegram failed {} time(s)'.format(self._retries))
ConnectionError: Connection to Telegram failed 5 time(s)

Any ideas?

Channels with spaces in their name?

I'm trying to connect to channels with spaces in their name and constantly get errors.

$ python ./main.py --telegram-channel Channel\ Name

Init program at Tue Nov  8 19:54:38 2022


> Authorized!

> Collecting data from Telegram Channel -> Channel Name
> ...

Traceback (most recent call last):
  File "/home/xxx/telegram/telegram-api/./main.py", line 161, in <module>
    entity_attrs = loop.run_until_complete(
  File "/home/xxx/miniconda3/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/home/xxx/telegram/telegram-api/api/__init__.py", line 57, in get_entity_attrs
    return await client.get_entity(source)
  File "/home/xxx/miniconda3/lib/python3.9/site-packages/telethon/client/users.py", line 335, in get_entity
    result.append(await self._get_entity_from_string(x))
  File "/home/xxx/miniconda3/lib/python3.9/site-packages/telethon/client/users.py", line 574, in _get_entity_from_string
    raise ValueError(
ValueError: Cannot find any entity corresponding to "Channel Name"

I've tried with quotes and with \, same error.

I also created a text file with the channel names using --batch-file, and it fails in the same manner.

Key error 'Telegram API credentials'

Hi

I've set up the tool via git clone and entered my credentials into the config file, but I receive this error when I try a test run. Any advice on how to fix it? I have entered my phone number in the required format and do not receive a text when running the script.

C:\Users\user\Desktop\Telegram api>python "C:\Users\user\Desktop\Telegram api\telegram-api\main.py" --telegram-channel omonmoscow
C:\Users\user\anaconda3\lib\site-packages\pandas\core\computation\expressions.py:20: UserWarning: Pandas requires version '2.7.3' or newer of 'numexpr' (version '2.7.1' currently installed).
from pandas.core.computation.check import NUMEXPR_INSTALLED
Traceback (most recent call last):
File "C:\Users\user\Desktop\Telegram api\telegram-api\main.py", line 71, in
config_attrs = get_config_attrs()
File "C:\Users\user\Desktop\Telegram api\telegram-api\utils_init_.py", line 33, in get_config_attrs
attrs = config['Telegram API credentials']
File "C:\Users\user\anaconda3\lib\configparser.py", line 960, in getitem
raise KeyError(key)
KeyError: 'Telegram API credentials'

Can it search on its own

Hello Esteban, I've been looking at your tracker project and was wondering whether it is able to search for Telegram channels on its own and save their data?

Cannot cast InputPeerChat to any kind of InputChannel

OS : Ubuntu 21
Python : 3.10.4

  • I'm using : python main.py --batch-file 'channels_text_file.txt' as source of channels, with one channel per line.
  • I'm able to see the JSON/CSV files getting created

After a successful installation of the requirements, I'm able to run it

but after a while, a few minutes, it crashes with this error :

Traceback (most recent call last):
File "/home/peer/tg-api/main.py", line 176, in
counter = write_collected_chats(
File "/home/peer/tg-api/utils/init.py", line 142, in write_collected_chats
ch['participants_count'] = process_participants_count(client, ch_id)
File "/home/peer/tg-api/utils/init.py", line 73, in process_participants_count
channel_request = loop.run_until_complete(
File "/home/peer/.pyenv/versions/3.10.4/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
return future.result()
File "/home/peer/tg-api/api/init.py", line 85, in full_channel_req
return await client(
File "/home/peer/.pyenv/versions/3.10.4/lib/python3.10/site-packages/telethon/client/users.py", line 30, in call
return await self._call(self._sender, request, ordered=ordered)
File "/home/peer/.pyenv/versions/3.10.4/lib/python3.10/site-packages/telethon/client/users.py", line 39, in _call
await r.resolve(self, utils)
File "/home/peer/.pyenv/versions/3.10.4/lib/python3.10/site-packages/telethon/tl/functions/channels.py", line 720, in resolve
self.channel = utils.get_input_channel(await client.get_input_entity(self.channel))
File "/home/peer/.pyenv/versions/3.10.4/lib/python3.10/site-packages/telethon/utils.py", line 263, in get_input_channel
_raise_cast_fail(entity, 'InputChannel')
File "/home/peer/.pyenv/versions/3.10.4/lib/python3.10/site-packages/telethon/utils.py", line 138, in _raise_cast_fail
raise TypeError('Cannot cast {} to any kind of {}.'.format(

TypeError: Cannot cast InputPeerChat to any kind of InputChannel.

I'm not a Python developer, so any hint that points me in the right direction to debug this would be much appreciated.
I'm not 100% sure, but this error could be entirely on my end.
