
FsFont: Few-Shot Font Generation by Learning Fine-Grained Local Styles (CVPR2022)

This is the official PaddlePaddle implementation of "FsFont: Few-Shot Font Generation by Learning Fine-Grained Local Styles" by Licheng Tang, Yiyang Cai, Jiaming Liu, Zhibin Hong, Mingming Gong, Minhu Fan, Junyu Han, Jingtuo Liu, Errui Ding and Jingdong Wang.
Paper Link: arxiv | Bibtex

Dependencies

PaddlePaddle == 2.3.2
torch >= 1.10.0 (for grid plot)
torchvision >= 0.11.0 (for grid plot)
sconf >= 0.2.3
lmdb >= 1.2.1

How to start

Data Preparation

1. Images & Characters

First, render images for all fonts, including the training fonts, the validation fonts, and your own content font. Organize the directory structure as below:

Font Directory
|--- font1
|--- font2
|    |--- ch1.png
|    |--- ch2.png
|    |--- ...
|--- ...
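
If your fonts exist as .ttf files, the following sketch (not part of this repo; the paths, canvas size, and character list are placeholder assumptions) shows one way to render per-character PNGs with Pillow:

import os
from PIL import Image, ImageDraw, ImageFont

def render_font(ttf_path, out_dir, chars, size=128):
    # One PNG per character: black glyph centered on a white canvas.
    os.makedirs(out_dir, exist_ok=True)
    font = ImageFont.truetype(ttf_path, int(size * 0.8))
    for ch in chars:
        img = Image.new("L", (size, size), color=255)
        draw = ImageDraw.Draw(img)
        left, top, right, bottom = draw.textbbox((0, 0), ch, font=font)
        x = (size - (right - left)) / 2 - left
        y = (size - (bottom - top)) / 2 - top
        draw.text((x, y), ch, fill=0, font=font)
        img.save(os.path.join(out_dir, f"{ch}.png"))

# Hypothetical paths and characters, for illustration only.
render_font("fonts/font1.ttf", "FontDirectory/font1", ["连", "转", "还"])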

You also need to split all characters into training characters and validation characters, expressed in Unicode (hex) format, and save them into JSON files. You can convert a UTF-8 character to this format with hex(ord(ch))[2:].upper():

train_unis: ["4E00", "4E01", ...]
val_unis: ["8E21", ...]
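
As a minimal sketch of that conversion (the file name and character list are examples only):

import json

train_chars = ["一", "丁", "连"]  # your training characters
train_unis = [hex(ord(ch))[2:].upper() for ch in train_chars]
# "一" -> "4E00", "丁" -> "4E01", "连" -> "8FDE"

with open("train_unis.json", "w") as f:
    json.dump(train_unis, f)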

2. Content-Reference mapping

This is a dict you have to create yourself before starting. You can find the full CR-mapping algorithm in the appendix of the paper; it uses the decomposition dictionary from LF-Font. The elements of the dict also need to be converted to Unicode format. Please make sure the keys of the CR mapping cover both train_unis and val_unis.

{content1: [ref1, ref2, ref3, ...], content2: [ref1, ref2, ref3, ...], ...}

Example (in UTF-8 format):

{连: [转, 还], 剧: [呢, 别, 卖], 愫: [累, 快, 请],  ...}

PS: You have to make sure that every content character has the same number of references, so that batches can be formed during training; pad by repeating a reference if needed,
e.g. 连: [转, 转, 还]
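
As a hedged sketch (not from this repo) of producing such a file, the snippet below converts a character-keyed mapping to Unicode keys and pads every reference list to a common length; padding by repeating a reference mirrors the example above, though the exact padding rule is an assumption for illustration:

import json

cr_mapping_chars = {"连": ["转", "还"], "剧": ["呢", "别", "卖"], "愫": ["累", "快", "请"]}
uni = lambda ch: hex(ord(ch))[2:].upper()

n_refs = max(len(refs) for refs in cr_mapping_chars.values())
cr_mapping = {}
for content, refs in cr_mapping_chars.items():
    padded = refs + [refs[0]] * (n_refs - len(refs))  # pad to a fixed length
    cr_mapping[uni(content)] = [uni(r) for r in padded]

with open("cr_mapping.json", "w") as f:
    json.dump(cr_mapping, f)
# "连": ["转", "还"] becomes "8FDE": ["8F6C", "8FD8", "8F6C"]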

3. Run scripts

python3 ./build_dataset/build_meta4train.py \
    --saving_dir ./results/your_task_name/ \
    --content_font path/to/content \
    --train_font_dir path/to/training_font \
    --val_font_dir path/to/validation_font \
    --seen_unis_file path/to/train_unis.json \
    --unseen_unis_file path/to/val_unis.json

Training

You can modify the configuration in the file cfgs/custom.yaml.

1. keys

  • work_dir: the root directory for saved results (keep it the same as saving_dir above).
  • data_path: path to the data lmdb environment (saving_dir/lmdb).
  • data_meta: path to the train meta file (saving_dir/meta).
  • content_font: the name of the font you want to use as the source font.
  • content_reference_json: the json file that stores the content-reference mapping.
  • The other values are hyperparameters for training.
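
For orientation, here is a hypothetical fragment of cfgs/custom.yaml wiring these keys together; all values are placeholders, and the shipped custom.yaml remains the authoritative list of hyperparameters:

# Hypothetical values; adjust to your own task.
work_dir: ./results/your_task_name/
data_path: ./results/your_task_name/lmdb
data_meta: ./results/your_task_name/meta
content_font: your_content_font_name
content_reference_json: path/to/cr_mapping.json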

2. Run scripts

python3 train.py \
    task_name \
    cfgs/custom.yaml \
    --resume path/to/your/pretrain_model.pdparams

Test

1. Run scripts

python3 inference.py ./cfgs/custom.yaml \
    --weight path/to/saved_weight.pdparams \
    --content_font path/to/content \
    --img_path path/to/reference \
    --saving_root path/to/saving_folder

Acknowledgements

Our code is adapted from LF-Font.

Bibtex

@InProceedings{Tang_2022_CVPR, 
    author    = {Tang, Licheng and Cai, Yiyang and Liu, Jiaming and Hong, Zhibin and Gong, Mingming and Fan, Minhu and Han, Junyu and Liu, Jingtuo and Ding, Errui and Wang, Jingdong}, 
    title     = {Few-Shot Font Generation by Learning Fine-Grained Local Styles}, 
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, 
    month     = {June}, 
    year      = {2022}, 
    pages     = {7895-7904} 
} 

Contact

For any questions, please file an issue or contact me at [email protected].


fsfont's Issues

How to run inference?

What should be passed to the arguments --content_font and --img_path?
Is this content_font the same directory as the content_font in build_datasets/build_trainset.sh? Should that directory contain the images of the training content characters?
What should img_path contain? Am I correct to put characters other than those in the content_font directory, rendered in the same font as the content_font images?

The "charlist" of the content font is empty, and "gen_unis" is also empty, when I run inference.sh. Please help.

Dataset

Hi, could you provide a dataset for running the demo?

Influence of reference character and component coverage on results

Thanks for open-sourcing this inspiring work!

I am now running some comparison experiments with this work. The visualizations during training look fine, but the test outputs are not so good. I suspect the components of our reference characters do not cover the test set well (we use only 16 reference characters in total).

So I wonder: do the choice of reference characters and the coverage of their components have a large impact on the results?

Dataset

Hello, I am very interested in your work, but we could not find a suitable dataset. Could you share your dataset?

Questions about the dataset

1. Do "Unseen Fonts" refer to font styles not seen during training? If so, how are characters in those unseen styles generated at evaluation time?
2. Is the decomposition table https://raw.githubusercontent.com/cjkvi/cjkvi-ids/master/ids.txt borrowed from elsewhere? How does it differ from LF-Font's decomposition table, and do different decomposition tables affect the experimental results?
3. Is https://chanind.github.io/hanzi-writer-data/ used to produce the Kaiti Line Font? Why is using the Kaiti Line Font as the content font a fair comparison?
4. Is this work unsupervised? Are all few-shot methods unsupervised?

Comparison experiments

Hello, thank you for your excellent work. Could you share the FUNIT and AGIS-NET code used in the comparison experiments? I trained FUNIT on my own font dataset, but the results fell far short of expectations, and I do not know what went wrong. Could you share the code for those baselines? My dataset images are 80x80.

The model cannot converge

I tried a small run (1500 iterations) to check the performance, but the loss hovers around 0.04~0.05 and I did not see it get smaller. Is this normal?
Here are my dataset and hyperparameters; any advice?
How did your training process go? Would it be possible to share your logs?

About the training process

Hi! Thank you for sharing your work. I was wondering if you could provide an example of the training log, e.g. the losses and generated images throughout the training process. I had trouble training with my own data, and it would be really helpful for me. Thanks in advance!

About the dataset and training

Excellent work!!
I have a few questions.
Roughly how many iterations does it take before results start to appear (i.e., the glyph shapes become recognizable)?

Evaluation problems during training

Why does this happen? Is it a problem with the mapping file I prepared? I am currently using the cr_mapping.json file that ships with the project.

Some questions about Ablation studies in the paper.

Hi! Thanks for sharing the great work!
I have some questions about FsFont.
Comparing the ablation section with the experimental results, the model without any of the new modules (the last row of the ablation table, which I would describe as just a GAN with a reference encoder and a content encoder) still performs quite well, even compared to LF-Font or MX-Font. Did I miss any details?

reference set

Hello!
Thank you very much for releasing the code for this work.
I have some questions about the reference set and hope you can answer them.
1) The paper mentions a reference set of 100 characters. Are these 100 characters drawn from the full set of 3396, or from the ~20K common characters? (The wording in the main text and in the supplementary material seems inconsistent.)
2) Do the characters in the reference set need to cover the 374 components mentioned in the paper? (When I build a custom reference set, I can reach 100 characters without covering all the components.)

Problems encountered during training

Hello, I would like to ask about training.

According to the paper, you train for 500k iterations. The problem I am seeing is that up to roughly 150k iterations the L1 loss stays below 0.03x, and the validation images, while not great, show rough outlines; but the longer I train, the worse it gets: the L1 loss rises to around 0.05, and the glyphs generated at validation come out incomplete or missing a radical.

I ran into a similar problem when training LF-Font, so I am not sure whether my dataset is simply hard to train on or whether I overlooked something; I trained following the instructions in your README and the default hyperparameters.

I hope you can spare some time to clear this up. Thank you very much!

About cr_mapping

I used my own dataset, but it didn't work, so I guess something is wrong with the files in meta. May I ask how the files in meta are generated, especially cr_mapping? Were they created manually?

Problems during inference

Could you explain why, when I run inference with a model trained for 100k iterations, the generated images are identical to one of the image sets in the training set? What causes this? Thanks.

apex

Hello, does running without apex installed affect the overall results? If it should be installed, which version of apex is needed, and which Python and torch versions does it correspond to?

Dataset

Hello!
Thank you very much for releasing the code for this work; the ideas in your paper are very novel and have helped me a lot. I have some questions about the dataset: when preparing it, do I need any files besides the two JSON files I create myself? I do not quite understand the content-reference mapping part: do I need to build an extra dict for it, or is that step already covered by ./build_dataset/build_meta4train.py?
I hope you can clarify; many thanks!

Some problems encountered during training

I ran into the following error during training:

Exception in thread Thread-3:
Traceback (most recent call last):
  File "/usr/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dataloader/dataloader_iter.py", line 536, in _thread_loop
    batch = self._get_data()
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dataloader/dataloader_iter.py", line 674, in _get_data
    batch.reraise()
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dataloader/worker.py", line 172, in reraise
    raise self.exc_type(msg)
TypeError: DataLoader worker(1) caught TypeError with message:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dataloader/worker.py", line 339, in _worker_loop
    batch = fetcher.fetch(indices)
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dataloader/fetcher.py", line 125, in fetch
    data.append(self.dataset[idx])
  File "/app/datasets/dataset_transformer.py", line 82, in __getitem__
    for uni in trg_unis]).unsqueeze_(1)
  File "/app/datasets/dataset_transformer.py", line 82, in <listcomp>
    for uni in trg_unis]).unsqueeze_(1)
  File "train.py", line 95, in <lambda>
    env_get = lambda env, x, y, transform: transform(read_data_from_lmdb(env, f'{x}_{y}')['img'])
TypeError: 'NoneType' object is not subscriptable

I would like to know why this happens.

Content-Reference mapping

Hello, I have read the paper and parts of the code; this is excellent work. However, I noticed that the key meta/cr_mapping.json file does not seem to be fully released. If convenient, could you share this file? I guarantee it would be used for academic research only.

Alternatively, could you give some pointers on how to generate it? Below are my questions about the Reference Selection subsection of the paper:

  1. The paper selects 100 references in total, with 3 references per character. Regarding the selection rule "Once the character contains two or more new components, we add this character to our reference set": does this mean that characters with more, and more complex, components should preferentially serve as references, such as "夒、𩱀、𩱪"?
  2. Single-component characters such as "一、乙" need not serve as references, because they can be generated from other references that contain them (e.g. 大, 艺), correct?
  3. In the ids.txt mentioned in the paper, some lines contain extra symbols, e.g. "U+4E0E 与 ⿹②一[GTKV] ⿻②一[J]", where ② apparently stands in for a component that cannot be rendered. Obviously this notation will collide with other components of the same stroke count. How should such cases be handled?

Thank you very much for reading this far. I look forward to your reply, and sincerely wish you good health and smooth work.
