Open-source project name: patrick-llgc/Learning-Deep-Learning
Open-source project URL: https://github.com/patrick-llgc/Learning-Deep-Learning
Open-source programming language: Jupyter Notebook (100.0%)
Open-source project introduction:
Paper notes
This repository contains my paper reading notes on deep learning and machine learning. It is inspired by Denny Britz and Daniel Takeshi. A minimalistic webpage generated with GitHub Pages can be found here.
About me
My name is Patrick Langechuan Liu. After about a decade of education and research in physics, I found my passion in deep learning and autonomous driving. I currently lead the development of perception features at Xpeng Motors, a fast-growing autonomous driving company.
What to read
Where to start?
If you are new to deep learning in computer vision and don't know where to start, I suggest you spend your first month or so diving deep into this list of papers. I did so (see my notes) and it served me well.
Here is a list of trustworthy sources of papers in case I run out of papers to read.
My Review Posts by Topics
I regularly update my blog on Towards Data Science.
2022-08 (1)
2022-07 (8)
PersFormer: 3D Lane Detection via Perspective Transformer and the OpenLane Benchmark [Notes] [BEVNet, lane line]
VectorMapNet: End-to-end Vectorized HD Map Learning [Notes] [BEVNet, LLD, Hang Zhao]
PETR: Position Embedding Transformation for Multi-View 3D Object Detection [Notes] ECCV 2022 [BEVNet]
PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images [Notes] [BEVNet, Megvii]
M^2BEV: Multi-Camera Joint 3D Detection and Segmentation with Unified Birds-Eye View Representation [Notes] [BEVNet, NVIDIA]
BEVDepth: Acquisition of Reliable Depth for Multi-view 3D Object Detection [Notes] [BEVNet, NuScenes SOTA, Megvii]
CVT: Cross-view Transformers for real-time Map-view Semantic Segmentation [Notes] CVPR 2022 oral [UTAustin, Philipp]
Wayformer: Motion Forecasting via Simple & Efficient Attention Networks [Notes] [Behavior prediction, Waymo]
LETR: Line Segment Detection Using Transformers without Edges CVPR 2021 oral
HDMapGen: A Hierarchical Graph Generative Model of High Definition Maps CVPR 2021 [HD mapping]
SketchRNN: A Neural Representation of Sketch Drawings [David Ha]
PolyGen: An Autoregressive Generative Model of 3D Meshes ICML 2020
SOLQ: Segmenting Objects by Learning Queries NeurIPS 2021 [Megvii, end-to-end, instance segmentation]
2022-06 (3)
BEVDet4D: Exploit Temporal Cues in Multi-camera 3D Object Detection [Notes] [BEVNet]
BEVerse: Unified Perception and Prediction in Birds-Eye-View for Vision-Centric Autonomous Driving [Notes] [Jiwen Lu, BEVNet, prediction]
BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation [Notes] [BEVNet, Han Song]
BEVFormer++: Improving BEVFormer for 3D Camera-only Object Detection [Waymo open dataset challenge 1st place in mono3d]
MTRA: 1st Place Solution for 2022 Waymo Open Dataset Challenge - Motion Prediction [Waymo open dataset challenge 1st place in motion prediction]
BEVSegFormer: Bird's Eye View Semantic Segmentation From Arbitrary Camera Rigs [BEVNet]
Panoptic SegFormer: Delving Deeper into Panoptic Segmentation with Transformers CVPR 2022 [NVIDIA]
Efficiently Identifying Task Groupings for Multi-Task Learning NeurIPS 2021 spotlight [MTL]
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time [Google, Golden Backbone]
"The Pedestrian next to the Lamppost" Adaptive Object Graphs for Better Instantaneous Mapping CVPR 2022
GitNet: Geometric Prior-based Transformation for Birds-Eye-View Segmentation [BEVNet, Baidu]
FUTR3D: A Unified Sensor Fusion Framework for 3D Detection [Hang Zhao]
MonoFormer: Towards Generalization of self-supervised monocular depth estimation with Transformers [monodepth]
Time3D: End-to-End Joint Monocular 3D Object Detection and Tracking for Autonomous Driving
cosFormer: Rethinking Softmax in Attention ICLR 2022
StretchBEV: Stretching Future Instance Prediction Spatially and Temporally [BEVNet, prediction]
MUTR3D: A Multi-camera Tracking Framework via 3D-to-2D Queries [BEVNet, tracking] CVPR 2022 workshop
Scene Representation in Bird’s-Eye View from Surrounding Cameras with Transformers [BEVNet, LLD] CVPR 2022 workshop
Multi-Frame Self-Supervised Depth with Transformers CVPR 2022
It's About Time: Analog Clock Reading in the Wild CVPR 2022 [Andrew Zisserman]
Efficient and Robust 2D-to-BEV Representation Learning via Geometry-guided Kernel Transformer [BEVNet]
StopNet: Scalable Trajectory and Occupancy Prediction for Urban Autonomous Driving ICRA 2022
SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation [Jiwen Lu]
ONCE-3DLanes: Building Monocular 3D Lane Detection CVPR 2022
K-Lane: Lidar Lane Dataset and Benchmark for Urban Roads and Highways CVPR 2022 workshop [3D LLD]
Multi-modal 3D Human Pose Estimation with 2D Weak Supervision in Autonomous Driving CVPR 2022 workshop
A Simple Baseline for BEV Perception Without LiDAR [TRI, BEVNet, vision+radar]
Reconstruct from Top View: A 3D Lane Detection Approach based on Geometry Structure Prior CVPR 2022 workshop
RIDDLE: Lidar Data Compression with Range Image Deep Delta Encoding CVPR 2022 [Waymo, Charles Qi]
Occupancy Flow Fields for Motion Forecasting in Autonomous Driving RAL 2022 [Waymo occupancy flow challenge]
LET-3D-AP: Longitudinal Error Tolerant 3D Average Precision for Camera-Only 3D Detection [Waymo open dataset challenge official metric]
Safe Local Motion Planning with Self-Supervised Freespace Forecasting CVPR 2021
The Core of the Data Closed Loop: Sharing an Auto-labeling Solution (数据闭环的核心 - Auto-labeling 方案分享)
2022-03 (1)
2022-02 (1)
TNT: Target-driveN Trajectory Prediction [Notes] CoRL 2020 [prediction, Waymo, Hang Zhao]
DenseTNT: End-to-end Trajectory Prediction from Dense Goal Sets [Notes] ICCV 2021 [prediction, Waymo, 1st place winner WOMD]
Scene Transformer: A unified architecture for predicting multiple agent trajectories [prediction, Waymo] ICLR 2022
SSIA: Monocular Depth Estimation with Self-supervised Instance Adaptation [VGG team, TTR, test time refinement, CVD]
CoMoDA: Continuous Monocular Depth Adaptation Using Past Experiences WACV 2021
MonoRec: Semi-supervised dense reconstruction in dynamic environments from a single moving camera CVPR 2021 [Daniel Cremers]
Plenoxels: Radiance Fields without Neural Networks
Lidar with Velocity: Motion Distortion Correction of Point Clouds from Oscillating Scanning Lidars [Livox, ISEE]
NWD: A Normalized Gaussian Wasserstein Distance for Tiny Object Detection
Towards Optimal Strategies for Training Self-Driving Perception Models in Simulation NeurIPS 2021 [Sanja Fidler]
Insta-DM: Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency AAAI 2021
Instance-wise Depth and Motion Learning from Monocular Videos NeurIPS 2020 workshop [website]
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis ECCV 2020 oral
BARF: Bundle-Adjusting Neural Radiance Fields ICCV 2021 oral
NerfingMVS: Guided Optimization of Neural Radiance Fields for Indoor Multi-view Stereo ICCV 2021 oral
Transfuser: Multi-Modal Fusion Transformer for End-to-End Autonomous Driving CVPR 2021
YOLinO: Generic Single Shot Polyline Detection in Real Time ICCV 2021 workshop [lld]
MonoRCNN: Geometry-based Distance Decomposition for Monocular 3D Object Detection ICCV 2021
MonoCInIS: Camera Independent Monocular 3D Object Detection using Instance Segmentation ICCV 2021 workshop
PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection CVPR 2020 [Waymo challenge 2nd place]
Geometry-based Distance Decomposition for Monocular 3D Object Detection ICCV 2021 [mono3D]
Offboard 3D Object Detection from Point Cloud Sequences CVPR 2021 [Charles Qi]
FreeAnchor: Learning to Match Anchors for Visual Object Detection NeurIPS 2019
AutoAssign: Differentiable Label Assignment for Dense Object Detection
Probabilistic Anchor Assignment with IoU Prediction for Object Detection ECCV 2020
FOVEA: Foveated Image Magnification for Autonomous Navigation ICCV 2021 [Argo]
PifPaf: Composite Fields for Human Pose Estimation CVPR 2019
Monocular 3D Localization of Vehicles in Road Scenes ICCV 2021 workshop [mono3D, tracking]
TransformerFusion: Monocular RGB Scene Reconstruction using Transformers
Conditional DETR for Fast Training Convergence
Anchor DETR: Query Design for Transformer-Based Detector [megvii]
PGD: Probabilistic and Geometric Depth: Detecting Objects in Perspective CoRL 2021
Adaptive Wing Loss for Robust Face Alignment via Heatmap Regression
What Makes for End-to-End Object Detection? PMLR 2021
Instances as Queries ICCV 2021 [instance segmentation]
One Million Scenes for Autonomous Driving: ONCE Dataset [Huawei]
2022-01 (1)
2021-12 (5)
2021-11 (4)
2021-10 (3)
2021-09 (11)
2021-08 (11)
2021-07 (1)
2021-06 (2)
2021-04 (5)
2021-03 (4)
2021-01 (7)
2020-12 (17)
DeFCN: End-to-End Object Detection with Fully Convolutional Network [Notes] [Transformer, DETR]
OneNet: End-to-End One-Stage Object Detection by Classification Cost [Notes] [Transformer, DETR]
Traffic Light Mapping, Localization, and State Detection for Autonomous Vehicles [Notes] ICRA 2011 [traffic light, Sebastian Thrun]
Towards lifelong feature-based mapping in semi-static environments [Notes] ICRA 2016
How to Keep HD Maps for Automated Driving Up To Date [Notes] ICRA 2020 [BMW]
Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection [Notes] CVPR 2021 [focal loss]
Visual SLAM for Automated Driving: Exploring the Applications of Deep Learning [Notes] CVPR 2018 workshop
Centroid Voting: Object-Aware Centroid Voting for Monocular 3D Object Detection [Notes] IROS 2020 [mono3D, geometry + appearance = distance]
Monocular 3D Object Detection in Cylindrical Images from Fisheye Cameras [Notes] [GM Israel, mono3D]
DeepPS: Vision-Based Parking-Slot Detection: A DCNN-Based Approach and a Large-Scale Benchmark Dataset TIP 2018 [Parking slot detection, PS2.0 dataset]
PSDet: Efficient and Universal Parking Slot Detection [Notes] IV 2020 [Zongmu, Parking slot detection]
PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with Pattern-based Weight Pruning [Notes] ASPLOS 2020 [pruning]
Scaled-YOLOv4: Scaling Cross Stage Partial Network [Notes] [yolo]
YOLOv5 by Ultralytics [Notes] [yolo, spatial2channel]
PP-YOLO: An Effective and Efficient Implementation of Object Detector [Notes] [yolo, paddle-paddle, baidu]
PointPainting: Sequential Fusion for 3D Object Detection [Notes] [nuScenes]
MotionNet: Joint Perception and Motion Prediction for Autonomous Driving Based on Bird's Eye View Maps [Notes] CVPR 2020 [Unseen moving objects, BEV]
Locating Objects Without Bounding Boxes [Notes] CVPR 2019 [weighted Hausdorff distance, NMS-free]
2020-11 (18)
TSP: Rethinking Transformer-based Set Prediction for Object Detection [Notes] ICCV 2021 [DETR, transformers, Kris Kitani]
Sparse R-CNN: End-to-End Object Detection with Learnable Proposals [Notes] CVPR 2021 [DETR, Transformer]
Unsupervised Monocular Depth Learning in Dynamic Scenes [Notes] CoRL 2020 [LearnK improved ver, Google]
MoNet3D: Towards Accurate Monocular 3D Object Localization in Real Time [Notes] ICML 2020 [Mono3D, pairwise relationship]
Argoverse: 3D Tracking and Forecasting with Rich Maps [Notes] CVPR 2019 [HD maps, dataset, CV lidar]
The H3D Dataset for Full-Surround 3D Multi-Object Detection and Tracking in Crowded Urban Scenes [Notes] ICRA 2019
Cityscapes 3D: Dataset and Benchmark for 9 DoF Vehicle Detection CVPRW 2020 [dataset, Daimler, mono3D]
NYC3DCars: A Dataset of 3D Vehicles in Geographic Context ICCV 2013
Towards Fully Autonomous Driving: Systems and Algorithms IV 2011
Center3D: Center-based Monocular 3D Object Detection with Joint Depth Understanding [Notes] [mono3D, LID+DepJoint]
ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection AAAI 2020 oral [mono3D]
CenterFusion: Center-based Radar and Camera Fusion for 3D Object Detection [Notes] WACV 2021 [early fusion, camera, radar]
3D-LaneNet+: Anchor Free Lane Detection using a Semi-Local Representation