1175 Commits (d6294761be89ce0eeb62d08f7ed93adff5d5d2c2)
 

Author SHA1 Message Date
AlexeyAB 61156239e0 Minor performance improvement 6 years ago
AlexeyAB dc7e7f035d improve XNOR Tensor Cores GEMM - N 2x unrolled - minor performance improvement 6 years ago
AlexeyAB 41814fc4b3 Minor fixes 6 years ago
AlexeyAB ff0733ed40 Speedup repack_input_kernel_bin() 6 years ago
AlexeyAB 2d747cab2b Minor fixes 6 years ago
AlexeyAB f91d5a5e09 Fixed __shfl() and __ballot() warnings 6 years ago
Alexey e1ec8a8b07 Update Readme.md 6 years ago
AlexeyAB f09a9c3315 XNOR uses Tensor Cores on Turing GPU CC>=7.3 (not Volta) 6 years ago
AlexeyAB e17bd9ba8f Minor fix 6 years ago
AlexeyAB a607784626 Added crnn.train.cfg just for test 6 years ago
AlexeyAB c7309c1fdb Fixed CRNN (RNN based on Convolution) layer 6 years ago
AlexeyAB bd91d0a908 Add try-catch to the http_stream.cpp 6 years ago
AlexeyAB c71354ab2e Added cudaGetLastError() for cudaHostAlloc() to reset last cuda error 6 years ago
AlexeyAB 381f90ebb8 Fixed CUDA error checking 6 years ago
AlexeyAB 2790464de1 Another compile fix 6 years ago
AlexeyAB ae8a8e6016 Compile fix 6 years ago
AlexeyAB 640bdbc063 LSTM, RNN, GRU - use connected_layer that uses cuDNN. Fixed CRNN for conv-layer with cuDNN. 6 years ago
AlexeyAB 0e1f3eaf35 Fixed DLL/SO 6 years ago
AlexeyAB 3692c174c5 Compile fix 6 years ago
AlexeyAB 110b5240a4 Fixed LSTM-layer 6 years ago
AlexeyAB 85b99872cb Use non-default stream for all CUDA-functions 6 years ago
AlexeyAB 00b87281f3 Fixed RNN (RNN, GRU, LSTM) with cuDNN (batch-norm) 6 years ago
AlexeyAB 9576cd4d89 Fixed memory allocation 6 years ago
AlexeyAB 090d934c0f Minor speedup on CPU 6 years ago
AlexeyAB 630f441e08 Minor CPU speedup - i7 6500K: 1000ms (AVX=1) instead of 1500ms (old AVX=1) and 2000ms (AVX=0) 6 years ago
AlexeyAB 1b15e2f8df Compile fix on Windows 6 years ago
Alexey da044776d1 Merge pull request #2282 from davidssmith/master 6 years ago
AlexeyAB a7366a5a0a Compile fix for CC < 7.3 6 years ago
David Smith 96773df469 add lstm_layer.o to Makefile 6 years ago
David Smith 5e778cd91e add LSTM layer 6 years ago
Alexey 29aa716bd9 Update Readme.md 6 years ago
AlexeyAB 2d3220cef5 Look at wmma::bmma_sync(), bmmaBitOpXOR, bmmaAccumulateOpPOPC 6 years ago
Alexey b47db904ee Merge pull request #2272 from Sauraus/master 6 years ago
Antek S. Baranski 8960fbfb3f gcc on OSX required explicit return value for empty (char *) in detection_to_json 6 years ago
AlexeyAB 2cd37ec73e Another minor fix 6 years ago
AlexeyAB 0541428f78 Merge branch 'master' of github.com:AlexeyAB/darknet 6 years ago
AlexeyAB 46be08db37 Minor fix 6 years ago
Alexey ec9b989c0a Update Readme.md 6 years ago
AlexeyAB 81f7fc2c7b Fixed network resize memory allocation 6 years ago
AlexeyAB 226322523e Fixed calc_anchors 6 years ago
Alexey 8b7494d920 Update Readme.md 6 years ago
AlexeyAB 17019854c3 XNOR minor fix 6 years ago
AlexeyAB 6e99e852ff Network resize is fixed 6 years ago
AlexeyAB 0e022d0912 Fixed timer 6 years ago
AlexeyAB 4ed6fd1ada Fix for compilation on Google Colab 6 years ago
AlexeyAB 3a51f4af74 Experimental repack 6 years ago
AlexeyAB bf6b40f4e9 Another CUDA performance improvements 6 years ago
AlexeyAB 5343aa4235 CUDA minor performance improvement 6 years ago
AlexeyAB 4c05166215 Temporary experimental XNOR on GPU (repack channels) 6 years ago
Alexey 920d792a0c Update Readme.md 6 years ago