- 26 Mar, 2020 1 commit

Zeeshan Siddiqui authored

- 18 Mar, 2020 2 commits

Zeeshan Siddiqui authored

Zeeshan Siddiqui authored

- 17 Mar, 2020 2 commits

Zeeshan Siddiqui authored

Zeeshan Siddiqui authored

- 16 Mar, 2020 1 commit

Sherlock authored

- 14 Mar, 2020 2 commits

Jesse Benson authored

Jesse Benson authored

- 12 Mar, 2020 5 commits

Edward Chen authored

Zeeshan Siddiqui authored
We want to implement the SoftmaxCrossEntropyLoss and NegativeLogLikelihoodLoss forward training ops for opset-12, which requires the ONNX submodule to point to the latest commit so we pick up the latest ONNX spec.
- Reverse-integrate changes from the *.in.proto files in the GitHub ONNX repo.
- Regenerate csharp/test/Microsoft.ML.OnnxRuntime.Tests/OnnxMl.cs.
- Disable ONNX tests for ops not yet implemented in the latest opset.
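
The forward computation of the SoftmaxCrossEntropyLoss op mentioned above can be sketched in plain Python for a single sample (a hedged illustration of the formula only; the helper name is not from the commit):

```python
import math

def softmax_cross_entropy_loss(scores, target):
    # loss = -log(softmax(scores)[target]), computed with the usual
    # max-subtraction trick for numerical stability.
    m = max(scores)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum_exp - scores[target]
```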

Ethan Tao authored
1. Fix a misaligned address in atomic_add().
2. Change GatherNDGradKernel to use atomic_add.
3. Enable/add unit tests for GatherNDGrad and reduction_ops using half. __CUDA_ARCH__ has no effect in .cc code, so leverage HasCudaEnvironment() instead.
4. Verified the convergence graph and perf test: P100 is much slower than V100 on fp16; fp16/128 needs the batch size reduced from 66 to 64 to avoid an OOM issue.
5. Verified the convergence test on Dev3/V100.
TBD: broken unit tests related to MatmulIntegerOpTest (they work on V100/Windows, though).

Ke Deng authored
This is a draft of graph cut and Wait/Record to demonstrate the cut and Wait/Record design. You can find the sub-models and profiling JSON under onnxruntime/test if you run "onnxruntime_test_all --gtest_filter=GradientGraphBuilderTest.TrainingSession_WithPipeline".

edgchen1 authored

- 11 Mar, 2020 4 commits

Edward Chen authored

Edward Chen authored

Edward Chen authored

Hariharan Seshadri authored
* Initial commit
* Minor nit
* Comment
* Fix build
* Fix build

- 10 Mar, 2020 5 commits

Hariharan Seshadri authored

Yufeng Li authored

Tianlei Wu authored
Update the script to make it work on fine-tuned BERT models exported by keras2onnx.

Yufeng Li authored

Scott McKay authored

- 09 Mar, 2020 2 commits

Dmitri Smirnov authored
Add a package download step before publishing.

Changming Sun authored
As discussed with Faith: because the data size is very small and changes are gradual, there is no need to delete the old data. We want to keep all the history.

- 06 Mar, 2020 5 commits

Tiago Koji Castro Shibata authored
* Publish release symbols
* Publish symbols if IsReleaseBuild

Andrew Kane authored

pranavm-nvidia authored
Fixes a typo in the help output for `symbolic_shape_infer`.

Tianlei Wu authored
* Update GeluFusion to support the pattern from PyTorch 1.4.
* Fix a bug where the check for an edge between mul2 and root was missing.
* Update the script to fuse GELU from PyTorch 1.4.
* Add a test for the Python optimizer.
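
For context, the erf-based GELU pattern that GeluFusion collapses into a single node computes x * 0.5 * (1 + erf(x / sqrt(2))); a minimal sketch of that formula (the helper name is illustrative, not from the commit):

```python
import math

def gelu(x: float) -> float:
    # Exact (erf-based) GELU, the multi-node subgraph pattern that
    # GeluFusion replaces with one fused Gelu node.
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
```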

Dmitri Smirnov authored
Add an env var with the package name.

- 05 Mar, 2020 3 commits

Yufeng Li authored

KeDengMS authored
This change fixes #3129. When running onnxruntime as a DLL on Windows, CUDA performs some internal cleanup when the process exits; after that, any call into CUDA crashes. Delay-loading makes the thread_local destructor run after the CUDA cleanup, hence the crash.

Dmitri Smirnov authored

- 04 Mar, 2020 6 commits

Yufeng Li authored
* Implement QuantizeLinear and DequantizeLinear
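
The ONNX QuantizeLinear and DequantizeLinear ops named above follow a simple affine quantization scheme; a minimal plain-Python sketch of those semantics (the helper names are illustrative, not the actual kernel code):

```python
def quantize_linear(x: float, scale: float, zero_point: int = 0) -> int:
    # ONNX QuantizeLinear: y = saturate(round(x / scale) + zero_point);
    # saturating to the uint8 range [0, 255] here for illustration.
    y = round(x / scale) + zero_point
    return max(0, min(255, y))

def dequantize_linear(q: int, scale: float, zero_point: int = 0) -> float:
    # ONNX DequantizeLinear: x = (q - zero_point) * scale
    return (q - zero_point) * scale
```

Dequantization only recovers the original value up to quantization error; quantize_linear(1.0, 0.5, 128) gives 130, and dequantize_linear(130, 0.5, 128) happens to give back exactly 1.0 because 1.0 is a multiple of the scale.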

take-cheeze authored

Dmitri Smirnov authored
Override the native package name. Keep the managed package name the same. Specify the package name for validation purposes. Fix up the validation package name parameter.

Prabhat authored

Tianlei Wu authored
(1) Add a performance test tool for BERT models.
(2) Add an accuracy test tool to compare inference results of the original and optimized BERT models.
(3) Add a test data generator tool to create test data for onnxruntime_perf_test.exe.
(4) Improve the BERT optimization script: verify the model producer for model_type; warn if the model is not fully optimized.
(5) Add a shape optimizer tool to assist in developing the optimization script.
(6) Update the readme.

Yufeng Li authored
* Implement MatmulInteger
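
For reference, the op named above (ONNX MatMulInteger) multiplies zero-point-adjusted 8-bit inputs with 32-bit accumulation; a minimal plain-Python sketch of the semantics (the helper name is illustrative, not the actual kernel):

```python
def matmul_integer(a, b, a_zero_point=0, b_zero_point=0):
    # ONNX MatMulInteger: Y = (A - a_zero_point) @ (B - b_zero_point),
    # accumulated in int32. a is MxK and b is KxN, as lists of lists.
    m, k, n = len(a), len(b), len(b[0])
    return [
        [sum((a[i][t] - a_zero_point) * (b[t][j] - b_zero_point)
             for t in range(k))
         for j in range(n)]
        for i in range(m)
    ]
```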

- 03 Mar, 2020 2 commits

Changming Sun authored
Previously, we put the "bin" folder of all the CUDA versions in the system PATH, with 10.2 at the front. It was a mess, so I removed all of them from the system PATH env, but I need to add one of them back through the build scripts. (The problem only affects the C# tests, not the C/C++ tests that are forked from build.py.)

smk2007 authored
* Add DML GPU pipelines
* Add x86 to the GPU DML dev build pipeline
* Enable DML x86 builds
* Fix uint64_t -> size_t warning
* Fix warnings
* Enable DML on x86 CI builds
* OperatorHelper 773 error: uint32_t vs uint64_t
* OperatorHelper 773 error: uint32_t vs uint64_t
* Make the x86 pipeline use the GPU pool
* More warnings
* Fix the x86 DirectML path
* Make the DML nuget package
* Disable tf_pnasnet_large
* Disable zfnet512
* Make validation use wildcards
* Disable x86 DML GPU tests
* Add args
* Update gpu.yml
* Change the nupkg wildcard
* Add debug statements
* Package the x86 DML nupkg
* Don't drop the managed nuget again from the DML pipeline build
* Add the DML EULA
* Rename the DirectML license so it doesn't clobber the existing license
* Fix casing on the DML package
* {} to ()
* Fix the license name
* Disable DML from x86 CI
* Typo and CR feedback
* Remove featurizers
* Ship the DML PDB as well