Topic: clip

Something interesting about clip

👇 Here are 530 public repositories matching this topic...

arrowluo / clip4clip

An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"

User: arrowluo

Home Page: https://arxiv.org/abs/2104.08860

video-text-retrieval multimodal-learning multimodality multimodal search ranking retrieval-model retrieval msrvtt lsmdc

chrisvin / easyreveal

Android Easy Reveal Library

User: chrisvin

android android-library clip easy easyreveal library reveal reveal-animations

cliport / cliport

CLIPort: What and Where Pathways for Robotic Manipulation

User: cliport

Home Page: https://cliport.github.io

clip robotics vision deep-learning natural-language-processing grounding vision-language manipulation pytorch rearrangement

cvhub520 / x-anylabeling

Effortless data labeling with AI support from Segment Anything and other awesome models.

User: cvhub520

labeling-tool paddle pytorch resnet sam yolo deep-learning deeplearning onnx clip

cyclomon / clipstyler

Official PyTorch implementation of "CLIPstyler: Image Style Transfer with a Single Text Condition" (CVPR 2022)

User: cyclomon

style-transfer clip

easychen / pushdeer

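Open-source push notification service that needs no app: on iOS 14+, scan a QR code to start using it. Also supports Quick Apps, iOS and Mac clients, Android clients, and self-built devices.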

User: easychen

app push clip notification-service

edvince / stable-diffusion-ncnn

Stable Diffusion in NCNN with C++, supporting txt2img and img2img

User: edvince

clip cpp diffusion mnn ncnn onnx stable-diffusion tensorrt tnn android

eps696 / aphantasia

CLIP + FFT/DWT/RGB = text to image/video

User: eps696

text-to-image clip text-to-video

florent37 / flutter-shapeofview

Give a custom shape to any Flutter widget, Material Design 2 ready

User: florent37

Home Page: https://pub.dev/packages/shape_of_view

flutter dart shape clip diagonal arc material behavior star circle

haltakov / natural-language-image-search

Search photos on Unsplash using natural language

User: haltakov

unsplash clip machine-learning computer-vision image-search photos

haltakov / natural-language-youtube-search

Search inside YouTube videos using natural language

User: haltakov

machine-learning computer-vision search youtube clip

hila-chefer / targetclip

[ECCV 2022] Official PyTorch implementation of the paper Image-Based CLIP-Guided Essence Transfer.

User: hila-chefer

image-generation clip stylegan2 computer-graphics eccv2022 image-editing image-manipulation

hila-chefer / transformer-mm-explainability

[ICCV 2021 Oral] Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-based network. Includes examples for DETR and VQA.

User: hila-chefer

transformers transformer vqa detr visualization explainability explainable-ai interpretability lxmert visualbert

iceclear / clip-iqa

[AAAI 2023] Exploring CLIP for Assessing the Look and Feel of Images

User: iceclear

iqa clip

j-min / clip-caption-reward

PyTorch code for "Fine-grained Image Captioning with CLIP Reward" (Findings of NAACL 2022)

User: j-min

Home Page: https://arxiv.org/abs/2205.13115

clip image-captioning reinforcement-learning vision-and-language

jingyi0000 / vlm_survey

Collection of AWESOME vision-language models for vision tasks

User: jingyi0000

computer-vision deep-learning knowledge-distillation survey transfer-learning vision-language-model clip multi-modal-model

keshiim / zmjimageeditor

ZMJImageEditor is a WeChat-style image editing component. It is powerful and easy to integrate, supporting drawing, text, rotation, cropping, stickers, and other functions.

User: keshiim

image editor wechat image-editor editor-helper imageeditor rotation clip draw testing

leondgarse / keras_cv_attention_models

Keras beit, caformer, CMT, CoAtNet, convnext, davit, dino, efficientdet, edgenext, efficientformer, efficientnet, eva, fasternet, fastervit, fastvit, flexivit, gcvit, ghostnet, gpvit, hornet, hiera, iformer, inceptionnext, lcnet, levit, maxvit, mobilevit, moganet, nat, nfnets, pvt, swin, tinynet, tinyvit, uniformer, volo, vanillanet, yolor, yolov7, yolov8, yolox, gpt2, llama2, alias kecam

User: leondgarse

tensorflow visualizing keras attention model imagenet coco recognition detection tf

liruiw / gensim

GenSim: Generating Robotic Simulation Tasks via Large Language Models

User: liruiw

Home Page: https://liruiw.github.io/gensim

clip gpt-4 llm pybullet simulation

marqo-ai / marqo

Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai

Organization: marqo-ai

Home Page: https://www.marqo.ai/

deep-learning information-retrieval machinelearning vector-search tensor-search clip multi-modal search-engine transformers vision-language

mbzuai-oryx / video-chatgpt

"Video-ChatGPT" is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.

Organization: mbzuai-oryx

Home Page: https://mbzuai-oryx.github.io/Video-ChatGPT

chatbot clip gpt-4 llama llava mulit-modal vicuna video-chatboat video-conversation vision-language vision-language-pretraining

mohamadzeina / disco_diffusion_local

Getting the latest versions of Disco Diffusion to work locally instead of on Colab, including how I run this on Windows despite some Linux-only dependencies ;)

User: mohamadzeina

disco-diffusion clip vqgan-clip pytti python-text-to-image text-to-image media-synthesis art disco-diffusion-local disco-diffusion-windows

monatis / clip.cpp

CLIP inference in plain C/C++ with no extra dependencies

User: monatis

c clip cpp ggml image-search multimodal

ofa-sys / chinese-clip

Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.

Organization: ofa-sys

chinese computer-vision multi-modal-learning nlp pytorch vision-and-language-pre-training image-text-retrieval clip pretrained-models vision-language

omerbt / text2live

Official PyTorch Implementation for "Text2LIVE: Text-Driven Layered Image and Video Editing" (ECCV 2022 Oral)

User: omerbt

Home Page: https://text2live.github.io/

eccv2022 image-editing text2live clip generative-model image-manipulation video-editing text-driven-editing single-image single-video

open-compass / vlmevalkit

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting GPT-4v, Gemini, QwenVLPlus, 30+ HF models, and 15+ benchmarks

Organization: open-compass

Home Page: https://rank.opencompass.org.cn/leaderboard-multimodal

gpt-4v large-language-models llava multi-modal openai vqa llm openai-api mplug-owl qwen

open-mmlab / mmpretrain

OpenMMLab Pre-training Toolbox and Benchmark

Organization: open-mmlab

Home Page: https://mmpretrain.readthedocs.io/en/latest/

image-classification resnet mobilenet pytorch deep-learning swin-transformer beit clip constrastive-learning convnext

opengvlab / instruct2act

Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model

Organization: opengvlab

llm robotics segment-anything chatgpt clip

pablosichert / react-truncate

React component for truncating multi-line spans and adding an ellipsis.

User: pablosichert

Home Page: https://www.webpackbin.com/bins/-Kw6QnAkjmv1OD6Of-ZD

react truncate ellipsis clip

paddlepaddle / paddlemix

Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrain models and diffusion model toolbox. Equipped with high performance and flexibility.

Organization: paddlepaddle

aigc stable-diffusion blip2 clip minigpt4 image-to-text text-to-image ppdiffusers controlnet multimodal

paddlepaddle / passl

PASSL includes self-supervised image algorithms such as SimCLR, MoCo v1/v2, BYOL, CLIP, PixPro, simsiam, SwAV, BEiT, and MAE, as well as basic vision algorithms such as Vision Transformer, DEiT, Swin Transformer, CvT, T2T-ViT, MLP-Mixer, XCiT, ConvNeXt, and PVTv2.

Organization: paddlepaddle

deep-learning moco moco-v2 simclr clip self-supervised-learning paddle swin-transformer vision-transformer beit

pathologyfoundation / plip

Pathology Language and Image Pre-Training (PLIP) is the first vision and language foundation model for Pathology AI (Nature Medicine). PLIP is a large-scale pre-trained model that can be used to extract visual and language features from pathology images and text descriptions. The model is a fine-tuned version of the original CLIP model.

Organization: pathologyfoundation

artificial-intelligence clip pathology vision-and-language

patrickjohncyh / fashion-clip

FashionCLIP is a CLIP-like model fine-tuned for the fashion domain.

User: patrickjohncyh

nlp clip ecommerce nlp-machine-learning fashion multi-modal transformer

pengsongyou / openscene

[CVPR'23] OpenScene: 3D Scene Understanding with Open Vocabularies

User: pengsongyou

Home Page: https://pengsongyou.github.io/openscene

3d-scene-understanding clip semantic-segmentation llm cvpr2023 point-cloud-segmentation point-clouds scannet matterport3d nuscenes

pharmapsychotic / clip-interrogator

Image to prompt with BLIP and CLIP

User: pharmapsychotic

clip pytorch

qin2dim / hcaptcha-challenger

🥂 Gracefully face hCaptcha challenge with MoE(ONNX) embedded solution.

User: qin2dim

Home Page: https://docs.captchax.top/

yolov5 hcaptcha opencv-python onnx-models hcaptcha-solver solver onnx yolo onnxruntime playwright

roboflow / awesome-openai-vision-api-experiments

Must-have resource for anyone who wants to experiment with and build on the OpenAI vision API 🔥

Organization: roboflow

chatgpt computer-vision openai classification clip zero-shot grounding-dino open-vocabulary-detection open-vocabulary-segmentation segment-anything

rom1504 / clip-retrieval

Easily compute CLIP embeddings and build a CLIP retrieval system with them

User: rom1504

Home Page: https://rom1504.github.io/clip-retrieval/

semantic-search deep-learning multimodal ai clip knn

ruffianzhong / rwidgethelper

Rapid Android UI development that tames the many quirks of native widgets

User: ruffianzhong

state selector corner circle drawableleft textview imageview gradient shape drawablewithtext

sense-gvt / declip

Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm

Organization: sense-gvt

big-model clip image-text multi-model self-supervised vision-language-pretraining zero-shot

skalskip / awesome-foundation-and-multimodal-models

👁️ + 💬 + 🎧 = 🤖 Curated list of top foundation and multimodal models! [Paper + Code + Examples + Tutorials]

User: skalskip

blip clip foundational-models grounding-dino llava multimodal segment-anything computer-vision nlp open-vocabulary-detection

skyworkaigc / skypaint-ai-diffusion

An optimized text-to-image model based on Stable Diffusion. It accepts both Chinese and English text inputs and can generate high-quality images in several modern art styles.

User: skyworkaigc

Home Page: https://sky-paint.singularity-ai.com/index.html#/

dreambooth machine-learning text-to-image bert clip cv latent-diffusion openai pytorch ai-painting

unum-cloud / uform

Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️

Organization: unum-cloud

Home Page: https://unum-cloud.github.io/uform/

huggingface-transformers language-vision multimodal pytorch semantic-search transformer cross-attention vector-search bert neural-network

v-iashin / video_features

Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, ResNet features.

User: v-iashin

Home Page: https://v-iashin.github.io/video_features

pytorch multi-gpu feature-extraction parallel video-features visual-features audio-features i3d vggish r2plus1d