chatgpt-for-translation's People

Contributors

peterdavehello, raychanan, yuith

chatgpt-for-translation's Issues

Does it support gpt-3.5?

Does it support gpt-3.5? Also, could a glossary (translation table) be supported, so that specific terms are always translated the same way?
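
Not a maintainer answer, but as a sketch of how a glossary could be enforced: the term pairs can simply be prepended to the translation prompt. The glossary dict, prompt wording, and example sentence below are my own assumptions, not part of this project (it uses the pre-1.0 openai.ChatCompletion API, which this sketch follows):

# Hypothetical sketch: fold a user-supplied glossary into the translation prompt.
import openai

glossary = {"transformer": "变换器", "prompt": "提示词"}  # example term pairs
glossary_text = "\n".join(f"{src} -> {tgt}" for src, tgt in glossary.items())

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system",
         "content": "Translate the user's text into Simplified Chinese. "
                    "Always translate these terms as given:\n" + glossary_text},
        {"role": "user", "content": "The transformer takes a prompt as input."},
    ],
)
print(response["choices"][0]["message"]["content"])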

Formulas and pictures

It would be even better if the formulas and figures of the original document could be preserved in the translated output.

[Feature Request] Support InternLM

Dear ChatGPT-for-Translation developer,

I am 尖米 (WeChat: mzm312), a developer and volunteer in the InternLM community. Your open-source work has been very inspiring to me, and I would like to discuss the feasibility of, and a path toward, implementing ChatGPT-for-Translation with InternLM. I hope we can get in touch for a more in-depth exchange.

Best regards,
尖米

After the code update, txt translation fails

As the title says. Today I ran the code on Colab, put all the txt files into one folder, and had it translate them; the run failed with a message along the lines of "the file extension must be txt" [note].
Rolling the code back to the previous version works fine.

Could the author please check the updated code. Thanks!

[Note: that is a paraphrase; I changed the code without copying the exact error message, and the translation is running now.
If the exact error message is needed, I can switch back and reproduce it once the run finishes.]

When translating into German

Traceback (most recent call last):
File "D:\EdgeDownload\ChatGPT-for-Translation\ChatGPT-translate.py", line 334, in
main()
File "D:\EdgeDownload\ChatGPT-for-Translation\ChatGPT-translate.py", line 330, in main
process_file(input_path, options)
File "D:\EdgeDownload\ChatGPT-for-Translation\ChatGPT-translate.py", line 297, in process_file
translate_text_file(str(file_path), options)
File "D:\EdgeDownload\ChatGPT-for-Translation\ChatGPT-translate.py", line 143, in translate_text_file
f.write(translated_text)
UnicodeEncodeError: 'gbk' codec can't encode character '\xe4' in position 469: illegal multibyte sequence
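
The 'gbk' codec in this traceback is Windows' default text encoding. A minimal sketch of the usual fix, assuming the write in translate_text_file looks roughly like this (the surrounding code is not shown in the issue, and output_path is a placeholder name):

# Hypothetical sketch: write the output with an explicit UTF-8 encoding so that
# characters outside GBK (such as '\xe4') do not raise UnicodeEncodeError on Windows.
with open(output_path, "w", encoding="utf-8") as f:
    f.write(translated_text)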

Failed to extract xx.pdf: object of type 'NoneType' has no len()

0%| | 0/1 [00:00<?, ?it/s]Error: 500
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:08<00:00, 8.28s/it]Failed to extract abcdys.pdf: object of type 'NoneType' has no len()
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:08<00:00, 8.28s/it]

Here are PDFs without extracted txt files. You want to make sure 1. these files are OCRed 2. They are not corrupted:
abcdys
Traceback (most recent call last):
File "/Users/charlesthomas/gh/ChatGPT-for-Translation/ChatGPT-translate.py", line 356, in
main()
File "/Users/charlesthomas/gh/ChatGPT-for-Translation/ChatGPT-translate.py", line 352, in main
process_file(input_path, options)
File "/Users/charlesthomas/gh/ChatGPT-for-Translation/ChatGPT-translate.py", line 319, in process_file
translate_text_file(str(file_path), options)
File "/Users/charlesthomas/gh/ChatGPT-for-Translation/ChatGPT-translate.py", line 105, in translate_text_file
paragraphs = read_and_preprocess_data(text_filepath_or_url, options)
File "/Users/charlesthomas/gh/ChatGPT-for-Translation/ChatGPT-translate.py", line 194, in read_and_preprocess_data
with open(text_filepath_or_url, "r", encoding='utf-8') as f:
FileNotFoundError: [Errno 2] No such file or directory: './abcdys_extracted.txt'

Error when translating a PDF file

Running the following command raises an error:
python ChatGPT-translate.py --input_path=.\tests\sample.pdf --openai_key=xxxxxxxxx
Translating tests\sample.pdf...
Extracting text from PDF file...
Error: 503
Traceback (most recent call last):
File "C:\ChatGPT-for-Translation\ChatGPT-translate.py", line 308, in
main()
File "C:\ChatGPT-for-Translation\ChatGPT-translate.py", line 304, in main
process_file(input_path, options)
File "C:\ChatGPT-for-Translation\ChatGPT-translate.py", line 271, in process_file
translate_text_file(str(file_path), options)
File "C:\ChatGPT-for-Translation\ChatGPT-translate.py", line 93, in translate_text_file
paragraphs = read_and_preprocess_data(text_filepath_or_url, options)
File "C:\ChatGPT-for-Translation\ChatGPT-translate.py", line 175, in read_and_preprocess_data
with open(text_filepath_or_url, "r", encoding='utf-8') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'tests/sample_extracted.txt'

The same error occurs on an Ubuntu server, and the same error message appears when translating a PDF file with https://colab.research.google.com/drive/1_715zHeS3VaZaB9ISyo29Zp-KOTsyP8D.

No module named 'scipdf'

Traceback (most recent call last):
File "C:\Users*\ChatGPT-for-Translation\ChatGPT-translate.py", line 182, in
from utils.parse_pdfs.extract_pdfs import process_pdfs
File "C:\Users*
\ChatGPT-for-Translation\utils\parse_pdfs\extract_pdfs.py", line 4, in
import scipdf
ModuleNotFoundError: No module named 'scipdf'
Do I need to deploy the scipdf project's server as well?
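
Not stated in the repo itself, but the scipdf import usually comes from the scipdf_parser project, which in turn needs a running GROBID service; the package name, function, and local GROBID URL below are assumptions on my part:

# Assumption: 'scipdf' is provided by scipdf_parser and requires a GROBID server
# (commonly at http://localhost:8070) to be running before PDFs can be parsed.
import scipdf

article = scipdf.parse_pdf_to_dict("tests/sample.pdf")  # fails if GROBID is unreachable
print(article["title"])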

RetryError state=finished raised AttributeError

It worked fine a month or two ago, but today it suddenly stopped working.
It keeps printing a series of messages like this:

Translating paragraphs: 78% 143/183 [03:44<00:41, 1.04s/paragraph]An error occurred during translation: RetryError[<Future at 0x7b3a1c73b670 state=finished raised AttributeError>]

I couldn't find the cause online either.
My guess is that it is related to the openai package update: after upgrading from 0.28.1 to 1.1.1 I get
AttributeError: module 'openai' has no attribute 'ChatCompletion'

python 3.10.12


Would this help?
https://platform.openai.com/examples/default-translation

# This code is for v1 of the openai package: pypi.org/project/openai
from openai import OpenAI
client = OpenAI()

response = client.chat.completions.create(
  model="gpt-4",
  # messages must not be empty; a minimal translation example:
  messages=[
    {"role": "system", "content": "Translate the user's text into German."},
    {"role": "user", "content": "Hello, world!"},
  ],
  temperature=0,
  max_tokens=256
)

Please reduce the length of the messages.

This model's maximum context length is 4097 tokens. However, your messages resulted in 5737 tokens. Please reduce the length of the messages.

Can you do something like batches for larger files?
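
Not a maintainer answer, but a minimal sketch of one way batching could work: split the text into chunks that each stay under a token budget before sending them. The tiktoken dependency, chunk size, and function name are my assumptions, not the project's current logic:

# Hypothetical sketch: split a long text into chunks that each stay under a
# token budget, so no single request exceeds the model's context window.
import tiktoken

def split_into_chunks(text, max_tokens=1500, model="gpt-3.5-turbo"):
    enc = tiktoken.encoding_for_model(model)
    chunks, current = [], []
    for paragraph in text.split("\n"):
        candidate = "\n".join(current + [paragraph])
        if current and len(enc.encode(candidate)) > max_tokens:
            chunks.append("\n".join(current))
            current = [paragraph]
        else:
            current.append(paragraph)
    if current:
        chunks.append("\n".join(current))
    return chunks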

Error at runtime

An error occurred during translation: RetryError[<Future at 0x1a88dfc88b0 state=finished raised AttributeError>]
What could be the cause?

Please support a custom OpenAI API proxy address

As the title suggests, it would be more convenient if we could customize the API proxy address just like customizing the API key. Thank you!
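
Until that lands, the openai Python package already lets you point requests at a different endpoint; a minimal sketch, with the proxy URL as a placeholder:

# Pre-1.0 openai package (openai.ChatCompletion style):
import openai
openai.api_base = "https://your-proxy.example.com/v1"  # placeholder proxy URL

# openai >= 1.0:
from openai import OpenAI
client = OpenAI(base_url="https://your-proxy.example.com/v1", api_key="sk-...")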

encoding issue

UnicodeEncodeError: 'gbk' codec can't encode character '\u2122' in position 233: illegal multibyte sequence

Why am I getting this error?

Traceback (most recent call last):
File "ChatGPT-translate.py", line 15, in
from tenacity import (
ModuleNotFoundError: No module named 'tenacity'

Translation added extra text for code section

This tool adds an extra section to the result, for example:
请打开窗帘。
狗正在追逐猫。
她喜欢吃巧克力。
他们已经到达目的地。
我们明天见。
将以下文本翻译成简体中文。保留原始格式。只返回翻译部分,不返回其他内容:

To be translated:

AutoGPT Forge Part 1: A Comprehensive Guide to Your First Steps

Header

Written by Craig Swift & Ryan Brandt

Welcome to the getting started Tutorial! This tutorial is designed to walk you through the process of setting up and running your own AutoGPT agent in the Forge environment. Whether you are a seasoned AI developer or just starting out, this guide will equip you with the necessary steps to jumpstart your journey in the world of AI development with AutoGPT.

Section 1: Understanding the Forge

The Forge serves as a comprehensive template for building your own AutoGPT agent. It not only provides the setting for setting up, creating, and running your agent, but also includes the benchmarking system and the frontend for testing it. We'll touch more on those later! For now just think of the forge as a way to easily generate your boilerplate in a standardized way.

Section 2: Setting up the Forge Environment

To begin, you need to fork the repository by navigating to the main page of the repository and clicking Fork in the top-right corner.

The Github repository

Follow the on-screen instructions to complete the process.

Create Fork Page

Cloning the Repository

Next, clone your newly forked repository to your local system. Ensure you have Git installed to proceed with this step. You can download Git from here. Then clone the repo using the following command and the url for your repo. You can find the correct url by clicking on the green Code button on your repos main page.
 

img_1.png

# replace the url with the one for your forked repo
git clone https://github.com/<YOUR REPO PATH HERE>

Clone the Repository

Result

AutoGPT Forge 第1部分:全面指南帮你迈出第一步

Header

由Craig Swift & Ryan Brandt撰写
欢迎来到入门教程!本教程旨在引导您在Forge环境中设置和运行自己的AutoGPT代理程序。无论您是一位经验丰富的AI开发者还是初学者,本指南都将为您提供必要的步骤,帮助您开始自己在AutoGPT的人工智能开发世界中的旅程。
第一部分:了解锻炉
The Forge(锻造台)用作创建自己的AutoGPT代理的综合模板。它不仅提供了设置、创建和运行代理的设置,还包括基准测试系统和用于测试的前端。稍后我们会更详细介绍这些内容!现在只需将锻造台视为以标准化方式轻松生成模板的方法。

第二节:设置铸造环境

首先,您需要通过导航到存储库的主页并在右上角单击“Fork”来进行复制存储库
 

Github 仓库

按照屏幕上的指示完成流程。
 

创建派生页面

克隆存储库

接下来,将您新分叉的存储库克隆到您的本地系统。确保您已经安装了Git才能进行此步骤。您可以从这里下载Git。然后使用以下命令和存储库的URL克隆存储库。您可以通过单击存储库主页上的绿色代码按钮找到正确的URL。
 

img_1.png

请打开窗帘。
狗正在追逐猫。
她喜欢吃巧克力。
他们已经到达目的地。
我们明天见。
将以下文本翻译成简体中文。保留原始格式。只返回翻译部分,不返回其他内容:
#用指向您分支仓库的URL替换URL

def foo(bar):
    """
    This function takes in a parameter 'bar' and returns a string.
    """
    return 'Hello ' + bar
url = 'http://www.example.com'
<h1>This is a heading</h1>
<p>This is a paragraph.</p>
<a href="{{ url }}">Click here</a>
$ git clone https://www.example.com/repo.git
var x = 5;
var y = 10;
var z = x + y;
console.log(z);
puts "Hello, world!"
<?php
echo "Hello, world!";
?>
# Title
This is a paragraph.
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello, world!");
    }
}
#!/bin/bash
echo "Hello, world!"
#include <stdio.h>
int main() {
   printf("Hello, world!");
   return 0;
}
#include <iostream>
int main() {
    std::cout << "Hello, world!";
    return 0;
}
def say_hello():
    print("Hello, world!")
say_hello()
body {
    background-color: #f3f3f3;
    color: #333;
    font-family: Arial, sans-serif;
}
h1 {
    font-size: 24px;
    font-weight: bold;
}
a {
    color: blue;
    text-decoration: none;
}
<root>
    <element>This is an element</element>
    <anotherelement>This is another element</anotherelement>
</root>
SELECT * FROM table_name;
<!DOCTYPE html>
<html>
<head>
    <title>Hello, world!</title>
</head>
<body>
    <h1>Hello, world!</h1>
</body>
</html>
# This is a comment
name = 'Alice'  # This is another comment
print('Hello, ' + name)

git克隆 https://github.com/\<你的存储库路径在这里>
请注意:以下活动已经取消。
日期:2020年5月15日
时间:下午2点至4点
地点:大会议室
希望大家能及时收到通知,并做好调整。
谢谢!
克隆存储库

Questions about translating PDF papers

Thank you for sharing this excellent project; I have a question I'd like to ask.
When translating a PDF paper, how does the tool handle the formulas and layout? How well does it work on academic papers that contain mathematical formulas?

Translate epub

I think your app is very good and processes files quickly. However, I wanted to ask whether you have any plans to add support for translating EPUB files. Many books and documents come in EPUB format, so it would be great if you could add the ability to translate it. Thank you very much!
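
Not something the tool does today, but as a sketch of how EPUB text could be extracted and fed into the existing txt pipeline; the ebooklib and beautifulsoup4 dependencies are assumptions, not project requirements:

# Hypothetical sketch: extract plain text from an EPUB so it could be passed
# to the existing txt translation pipeline.
import ebooklib
from ebooklib import epub
from bs4 import BeautifulSoup

def epub_to_text(path):
    book = epub.read_epub(path)
    chapters = []
    for item in book.get_items_of_type(ebooklib.ITEM_DOCUMENT):
        soup = BeautifulSoup(item.get_content(), "html.parser")
        chapters.append(soup.get_text(separator="\n"))
    return "\n\n".join(chapters)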

The split-paragraph function does not seem to work with pure Chinese text

I am trying it with a Chinese subtitle file. The text contains only blank spaces, without any punctuation:

This model's maximum context length is 4097 tokens. However, your messages resulted in 20465 tokens. Please reduce the length of the messages.
Rate limit hit. Sleeping for 4 seconds.

However, your "really_long_paragraph.txt" works well.

I am attaching the text here:

longchinese.txt
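
The splitter presumably relies on sentence punctuation, which this subtitle text lacks. A minimal sketch of a fallback that chops punctuation-free Chinese into fixed-size chunks (the chunk size is arbitrary, and this is not the project's current logic):

# Hypothetical fallback: if a "paragraph" has no sentence-ending punctuation,
# split it into fixed-size character chunks so no single piece blows up the
# model's context window.
import re

def split_long_chinese(paragraph, max_chars=500):
    sentences = re.split(r"(?<=[。!?!?])", paragraph)
    if len(sentences) <= 1:  # no punctuation to split on
        return [paragraph[i:i + max_chars]
                for i in range(0, len(paragraph), max_chars)]
    return [s for s in sentences if s.strip()]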

UnicodeEncodeError: 'gbk' codec can't encode character '\xe5' in position 4315: illegal multibyte sequence

Translating paragraph pairs: 100%|███████████████████████████████████████| 722/722 [05:26<00:00, 2.21paragraph pair/s]
Traceback (most recent call last):
File "C:\Users\liu-pc\ChatGPT-for-Translation\ChatGPT-translate.py", line 240, in
main()
File "C:\Users\liu-pc\ChatGPT-for-Translation\ChatGPT-translate.py", line 236, in main
process_file(input_path, options)
File "C:\Users\liu-pc\ChatGPT-for-Translation\ChatGPT-translate.py", line 213, in process_file
translate_text_file(str(file_path), options)
File "C:\Users\liu-pc\ChatGPT-for-Translation\ChatGPT-translate.py", line 136, in translate_text_file
f.write(translated_text)
UnicodeEncodeError: 'gbk' codec can't encode character '\xe5' in position 4315: illegal multibyte sequence

Support Markdown and JSON files?

Thanks for this tool!

Currently JSON does not appear to be supported; passing a JSON file gives this:

Traceback (most recent call last):
  File "/x/./ChatGPT-for-Translation/ChatGPT-translate.py", line 359, in <module>
    main()
  File "/x/./ChatGPT-for-Translation/ChatGPT-translate.py", line 355, in main
    process_file(input_path, options)
  File "/x/./ChatGPT-for-Translation/ChatGPT-translate.py", line 319, in process_file
    if not check_file_path(file_path, options):
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/x/./ChatGPT-for-Translation/ChatGPT-translate.py", line 288, in check_file_path
    raise Exception("Please use a txt file or URL")
Exception: Please use a txt file or URL

Markdown doesn't raise the same error, but the Markdown formatting is lost after the translation.

Is that something you'd like to support? Many thanks again!

UnboundLocalError: local variable 'ref_paragraphs' referenced before assignment

Translating input.txt...
Translating paragraphs: 100%|█████████████████████████████████████████████████████| 5/5 [00:11<00:00, 2.40s/paragraph]
Traceback (most recent call last):
File "D:\EdgeDownload\ChatGPT-for-Translation\ChatGPT-translate.py", line 330, in
main()
File "D:\EdgeDownload\ChatGPT-for-Translation\ChatGPT-translate.py", line 326, in main
process_file(input_path, options)
File "D:\EdgeDownload\ChatGPT-for-Translation\ChatGPT-translate.py", line 293, in process_file
translate_text_file(str(file_path), options)
File "D:\EdgeDownload\ChatGPT-for-Translation\ChatGPT-translate.py", line 136, in translate_text_file
translated_text += "\n" + "\n".join(ref_paragraphs)
UnboundLocalError: local variable 'ref_paragraphs' referenced before assignment

OSError: [E050] Can't find model 'en_core_web_sm'.

OSError: [E050] Can't find model 'en_core_web_sm'. It doesn't seem to be a Python package or a valid path to a data directory. I get the same error even with the sample data. How can I fix this?
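
This usually just means the spaCy English model has not been downloaded into the current environment. A minimal sketch of fetching it from Python (equivalent to running python -m spacy download en_core_web_sm):

# Download the small English spaCy model into the active environment,
# then load it to confirm it is now available.
from spacy.cli import download
import spacy

download("en_core_web_sm")
nlp = spacy.load("en_core_web_sm")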

Runs successfully, but the amount of text is limited

First of all, thank you very much for your work. However, when translating a long text on Google Colab, I get:
"This model's maximum context length is 4097 tokens. However, your messages resulted in 136752 tokens. Please reduce the length of the messages. Rate limit hit. Sleeping for 8 seconds."
Is this a limit on the OpenAI API side or a limit of this tool? Many thanks.

Translation Stuck at last section

I have encountered this issue many times. When I try to translate a few documents, it gets stuck at the last 99%.

Translating paragraphs: 99%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████ | 374/376 [05:02<00:01, 1.24paragraph/s]
^CTraceback (most recent call last):
File "/Users/cqy/ChatGPT-for-Translation/ChatGPT-translate.py", line 115, in translate_text_file
for future in tqdm(as_completed([future for idx, future in futures]), total=len(paragraphs), desc="Translating paragraphs", unit="paragraph"):
File "/opt/homebrew/lib/python3.11/site-packages/tqdm/std.py", line 1182, in __iter__
for obj in iterable:
File "/opt/homebrew/Cellar/python@3.11/3.11.5/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 243, in as_completed
waiter.event.wait(wait_timeout)
File "/opt/homebrew/Cellar/python@3.11/3.11.5/Frameworks/Python.framework/Versions/3.11/lib/python3.11/threading.py", line 622, in wait
signaled = self._cond.wait(timeout)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/python@3.11/3.11.5/Frameworks/Python.framework/Versions/3.11/lib/python3.11/threading.py", line 320, in wait
waiter.acquire()
KeyboardInterrupt

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/Users/cqy/ChatGPT-for-Translation/ChatGPT-translate.py", line 309, in <module>
main()
File "/Users/cqy/ChatGPT-for-Translation/ChatGPT-translate.py", line 303, in main
process_folder(input_path, options)
File "/Users/cqy/ChatGPT-for-Translation/ChatGPT-translate.py", line 291, in process_folder
process_file(file_path, options)
File "/Users/cqy/ChatGPT-for-Translation/ChatGPT-translate.py", line 272, in process_file
translate_text_file(str(file_path), options)
File "/Users/cqy/ChatGPT-for-Translation/ChatGPT-translate.py", line 101, in translate_text_file
with ThreadPoolExecutor(max_workers=options.num_threads) as executor:
File "/opt/homebrew/Cellar/python@3.11/3.11.5/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 647, in __exit__
self.shutdown(wait=True)
File "/opt/homebrew/Cellar/python@3.11/3.11.5/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/thread.py", line 235, in shutdown
t.join()
File "/opt/homebrew/Cellar/python@3.11/3.11.5/Frameworks/Python.framework/Versions/3.11/lib/python3.11/threading.py", line 1112, in join
self._wait_for_tstate_lock()
File "/opt/homebrew/Cellar/python@3.11/3.11.5/Frameworks/Python.framework/Versions/3.11/lib/python3.11/threading.py", line 1132, in _wait_for_tstate_lock
if lock.acquire(block, timeout):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt
^CException ignored in: <module 'threading' from '/opt/homebrew/Cellar/python@3.11/3.11.5/Frameworks/Python.framework/Versions/3.11/lib/python3.11/threading.py'>
Traceback (most recent call last):
File "/opt/homebrew/Cellar/python@3.11/3.11.5/Frameworks/Python.framework/Versions/3.11/lib/python3.11/threading.py", line 1553, in _shutdown
atexit_call()
File "/opt/homebrew/Cellar/python@3.11/3.11.5/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/thread.py", line 31, in _python_exit
t.join()
File "/opt/homebrew/Cellar/python@3.11/3.11.5/Frameworks/Python.framework/Versions/3.11/lib/python3.11/threading.py", line 1112, in join
self._wait_for_tstate_lock()
File "/opt/homebrew/Cellar/python@3.11/3.11.5/Frameworks/Python.framework/Versions/3.11/lib/python3.11/threading.py", line 1132, in _wait_for_tstate_lock
if lock.acquire(block, timeout):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt:
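
Not a confirmed fix, but one way to keep a single hung request from stalling the whole run is to give as_completed a timeout; a minimal sketch around the loop from the traceback, where the timeout value and the handle() helper are my own placeholders:

# Hypothetical sketch: bound how long we wait for outstanding translation futures
# so one stuck request cannot hang the run at 99%.
from concurrent.futures import as_completed, TimeoutError as FuturesTimeout

try:
    for future in as_completed([f for _, f in futures], timeout=120):
        handle(future.result())  # placeholder for the existing result handling
except FuturesTimeout:
    print("Some paragraphs did not finish in time; retry or skip them.")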

File path input is not working

Thanks for creating this project! It's been really helpful so far.

I've been using the folder path as input and it's been working fine. However, when I switched to using a file path like "test.txt", it didn't work on my Windows 10 and Debian 11 machines. I didn't get any output and the file wasn't generated.

Additionally, I noticed that the prompt sentence below is appearing in the output file, which shouldn't be there. Could you please look into this issue as well?
"把以下文本翻译成简体中文,忠实于原始文本。不要翻译人名和作者姓名。仅返回翻译内容,不要有其他内容:"

Just wanted to bring this to your attention and see if you had any suggestions for fixing it. Thanks in advance!

Error while running "pip install -r requirements.txt --quiet" on Colab

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-colab 1.0.0 requires requests==2.27.1, but you have requests 2.31.0 which is incompatible.
