Running YOLOv9: A Hands-on Log

How to run YOLOv9

Introduction

YOLOv9 is the latest release in the YOLO family. In one sentence: better, faster, and stronger than its predecessors.

This article aims to get the YOLOv9 code running in the simplest way possible. It does not cover training; it only shows how to run image detection with the official YOLOv9 weights.

Prerequisites

Some of the sites below may be unreachable from certain networks; if so, use a proxy/VPN for all downloads.

  1. Git: Git - Downloading Package (git-scm.com)
  2. Miniconda, a virtual-environment manager. You can also use Anaconda; Miniconda is just more lightweight: [Index of /anaconda/miniconda/ | Tsinghua Open Source Mirror](https://mirrors.tuna.tsinghua.edu.cn/anaconda/miniconda/)

Make sure you pick the right build of each installer; in particular, distinguish the x86 and ARM packages. Intel (and AMD) CPUs generally use the x64 build.


Installing Git (optional)

1. Double-click the installer.


2. Click Next through all the prompts.


3. Verify the installation: press Win+R, type cmd in the Run box, and press Enter to open a command prompt.


At the prompt, type git -v (or git --version on older Git releases); if a version number is printed, the installation succeeded.


Installing Miniconda

Double-click the installer and click Next through the prompts, but make a note of the install path; we will need it later.


Installation may take a while.


On the final screen, uncheck the two checkboxes and click Finish.

After installation, configure the environment variables: search for "environment variables" in the Windows search bar.


In the System Properties window that opens, click the "Environment Variables…" button.

Then add the following three entries to the Path variable:

C:\Users\XIANR\miniconda3

C:\Users\XIANR\miniconda3\Library\bin

C:\Users\XIANR\miniconda3\Scripts

Note: replace C:\Users\XIANR\miniconda3 in the three entries above with your own install path. For example, if you installed to D:\miniconda3, your three entries would be:

D:\miniconda3

D:\miniconda3\Library\bin

D:\miniconda3\Scripts

After adding the three entries, click OK three times to save your changes.

Then press Win+R to open cmd again and type conda -V; if a Conda version number appears, the installation succeeded.
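For reference, here are both checks from a fresh cmd window; `where conda` simply shows which conda executable was found on your PATH:

```
REM Confirm a conda executable is reachable via PATH
where conda
REM Print the installed Conda version
conda -V
```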


Creating a Python environment

In cmd, type conda create -n yolo python=3.9. After a short wait, conda will ask you to confirm the plan; type y.


Note: if downloads are slow, use a proxy/VPN.

When conda reports that the environment was created, this step is done.


Then type conda init, close cmd and reopen it, and type conda activate yolo. When (yolo) appears at the left of the prompt, the Python environment is ready.
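For reference, the whole sequence from this section in one place:

```
REM Create an isolated Python 3.9 environment named "yolo"
conda create -n yolo python=3.9
REM Hook conda into the shell, then close and reopen cmd
conda init
REM In the new window: activate the environment and check the interpreter
conda activate yolo
python --version
```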


Downloading the YOLOv9 source code

If you installed Git in step 1

Create a folder to hold the source code, type cmd in the File Explorer address bar, and press Enter; a command prompt opens inside that folder.


Type git clone https://github.com/WongKinYiu/yolov9.git


The yolov9 code now appears in the folder.
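In command form, including stepping into the freshly cloned directory:

```
REM Clone the official YOLOv9 repository, then enter it
git clone https://github.com/WongKinYiu/yolov9.git
cd yolov9
```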


If you did not install Git

Download the source from https://github.com/WongKinYiu/yolov9 yourself and unzip it.

Installing the Python packages

Enter the source folder and type cmd in the address bar to open a command prompt there.


At the prompt, type conda activate yolo


If you have a proxy/VPN

Run pip install -r requirements.txt

If you don't

Run pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

The dependencies will now be downloaded and installed automatically; this can take a long time.


When you see "Successfully installed …", you're done! If an error occurs, run the command again; hiccups like this are hard to avoid when setting up an environment.
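As a quick sanity check, try importing two of the heavier dependencies; torch and opencv-python are both listed in the repo's requirements.txt, so this should print two version numbers without errors:

```
REM Run inside the activated "yolo" environment
python -c "import torch, cv2; print(torch.__version__, cv2.__version__)"
```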


Downloading the pretrained weights

Download a pretrained model from https://github.com/WongKinYiu/yolov9/releases; any file with the .pt suffix will do. After downloading, rename it to yolo.pt and put it in the code directory.
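If you prefer to download from the command line, something along these lines works on Windows 10+ (which ships with curl). Note that the v0.1 tag and the yolov9-c.pt asset name are assumptions based on the releases page; substitute whichever .pt asset you chose:

```
REM Download a release asset and save it under the name detect.py expects
curl -L -o yolo.pt https://github.com/WongKinYiu/yolov9/releases/download/v0.1/yolov9-c.pt
```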


Running detection

Congratulations on making it this far; now for the exciting part: running the model.

First, put a few images into the repo's data/images folder.


Go back to the root of the yolov9 code directory and open cmd.

Then type python detect.py at the prompt and wait a moment; if detection output appears, the run succeeded.
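If you would rather not rely on the script's defaults, a more explicit invocation along these lines should also work; the flag names follow the repo's detect.py, and --device cpu is only needed when no GPU is configured:

```
REM Explicit weights, input folder, confidence threshold, and device
python detect.py --weights yolo.pt --source data/images --conf-thres 0.25 --device cpu
```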


The command output reports where the results were saved (here, the runs\detect\exp2 folder); open that folder to see the detections.


Test Article 2

Hi 👋. This is the homepage of the Intelligent Cognitive Systems Laboratory (iCOST), Beijing University of Posts and Telecommunications.


Introduction

The Intelligent Cognitive Systems Laboratory (iCOST) at BUPT (Beijing University of Posts and Telecommunications) is actively engaged in long-term research in multiple cutting-edge fields, including computer vision and embodied intelligence. Our research spans various sub-domains such as action recognition, human pose prediction and estimation, uncertainty research, multimodal and audio-visual learning, audio-visual event detection, medical image segmentation, 3D object detection, adversarial strategies, embodied navigation, and robot grasping.

Our research team has achieved substantial results, publishing numerous high-quality research papers in internationally recognized and authoritative journals such as IEEE Transactions on Image Processing (IEEE TIP), IEEE Transactions on Circuits and Systems for Video Technology (IEEE TCSVT), and top-tier conferences like AAAI.

In the Intelligent Cognitive Systems Laboratory, we encourage communication and collaboration among team members, fostering a rigorous and harmonious academic atmosphere. We warmly welcome scholars, researchers, and students who are passionate about artificial intelligence and related fields to join us or collaborate. We believe that everyone with a curiosity for science can find their stage here.

Papers

2D Human Pose Estimation

[PR 2024] Kinematics Modeling Network for Video-based Human Pose Estimation [paper] [code]

[TIP 2022] Relation-Based Associative Joint Location for Human Pose Estimation in Videos [paper] [code]

[KBS 2024] DHRNet: A Dual-Path Hierarchical Relation Network for Multi-Person Pose Estimation [code]

[-] BiHRNet: A Binary high-resolution network for Human Pose Estimation [paper]

3D Human Pose Estimation

[AAAI 2024] Lifting by Image - Leveraging Image Cues for Accurate 3D Human Pose Estimation [paper]

3D Human Motion Prediction

[KBS 2024] April-GCN: Adjacency Position-velocity Relationship Interaction Learning GCN for Human motion prediction [paper]

[TNNLS 2023] Learning Constrained Dynamic Correlations in Spatiotemporal Graphs for Motion Prediction [paper] [code]

[TCSVT 2023] Collaborative Multi-Dynamic Pattern Modeling for Human Motion Prediction [paper]

[TCSVT 2022] Towards more realistic human motion prediction with attention to motion coordination [paper]

[TCSVT 2021] TrajectoryCNN: a new spatio-temporal feature learning network for human motion prediction [paper] [code]

[Neurocomputing 2024] Physics-constrained Attack against Convolution-based Human Motion Prediction [paper] [code]

[Neurocomputing 2022] Temporal consistency two-stream CNN for human motion prediction [paper]

[Robot 2022] A Symmetric Residual Network for Human Motion Prediction [paper]

[MPE 2020] A Hierarchical Static-Dynamic Encoder-Decoder Structure for 3D Human Motion Prediction with Residual CNNs [paper] [code]

[Cognitive Computation and Systems 2020] Stacked residual blocks based encoder–decoder framework for human motion prediction [code]

[-] Uncertainty-aware Human Motion Prediction [paper]

[-] MSSL: Multi-scale Semi-decoupled Spatiotemporal Learning for 3D human motion prediction [paper] [code]

[-] DeepSSM: Deep State-Space Model for 3D Human Motion Prediction [paper] [code]

Early Action Prediction

[TIP 2024] Rich Action-semantic Consistent Knowledge for Early Action Prediction [paper] [code]

[ICCSIP 2022] A discussion of data sampling strategies for early action prediction [paper]

[China Automation Congress 2023] An end-to-end multi-scale network for action prediction in videos

Skeleton-based Human Action Recognition

[TCSVT 2024] SiT-MLP: A Simple MLP with Point-wise Topology Feature Learning for Skeleton-based Action Recognition [paper] [code]

[RAS 2020] DWnet: Deep-wide network for 3D action recognition [paper] [code]

[-] Spatial-Temporal Decoupling Contrastive Learning for Skeleton-based Human Action Recognition [paper] [code]

Group Activity Recognition

[KBS 2024] MLP-AIR: An effective MLP-based module for actor interaction relation learning in group activity recognition [paper]

Uncertainty-aware Scene Understanding with Point Clouds

[TGRS 2023] Neighborhood Spatial Aggregation MC Dropout for Efficient Uncertainty-aware Semantic Segmentation in Point Clouds [paper] [code]

[TCSVT 2023] Instance-incremental Scene Graph Generation from Real-world Point Clouds via Normalizing Flows [paper] [code]

[TIM 2020] Multigranularity Semantic Labeling of Point Clouds for the Measurement of the Rail Tanker Component With Structure Modeling [paper] [code]

[ICRA 2021] Neighborhood Spatial Aggregation based Efficient Uncertainty Estimation for Point Cloud Semantic Segmentation [paper] [code]

[Tsinghua Science and Technology 2023] Dynamic Scene Graph Generation of Point Clouds with Structural Representation Learning [paper]

Audio Visual Learning

[TMM 2023] Leveraging the Video-level Semantic Consistency of Event for Audio-visual Event Localization [paper] [code]

[EMNLP 2023] Target-Aware Spatio-Temporal Reasoning via Answering Questions in Dynamic Audio-Visual Scenarios [paper] [code]

[Journal of Computer Applications 2021] Audio-Visual Joint Action Recognition Based on a Key-Frame Selection Network [paper]

[-] Past Future Motion Guided Network for Audio Visual Event Localization [paper]

Audio and Speech Processing

[ICPR 2024] Full-frequency dynamic convolution: a physical frequency-dependent convolution for sound event detection [paper] [code]

[Interspeech 2024] MFF-EINV2: Multi-scale Feature Fusion across Spectral-Spatial-Temporal Domains for Sound Event Localization and Detection [paper] [code]

Medical Image Segmentation

[IEEE-CYBER 2023] Multi-task Learning Network for CT Whole Heart Segmentation [paper]

[Biomedical Signal Processing and Control 2022] DC-net: Dual-Consistency Semi-Supervised Learning for 3D Left Atrium Segmentation from MRI [paper]

Adversarial Attack

[Neurocomputing 2023] Physics-constrained attack against convolution-based human motion prediction [paper]

Others

[MTAP 2023] Transfer the global knowledge for current gaze estimation [paper]

[TCSVT 2021] Energy-based Periodicity Mining with Deep Features for Action Repetition Counting in Unconstrained Videos [paper] [code]

[ROBIO 2019] DBNet: A New Generalized Structure Efficient for Classification [paper] [code]

[-] SDVRF: Sparse-to-Dense Voxel Region Fusion for Multi-modal 3D Object Detection [paper]

Last update: August 22, 2024

Feel free to contact us at 7858833@bupt.edu.cn or zsj@bupt.edu.cn.