Insights: pytorch/pytorch
Overview
1 Release published by 1 person
2 Pull requests merged by 2 people
-
[EZ] Get rid of utf-8 quotes
#124932 merged
Apr 25, 2024 -
Specify the exact table we upload metrics to
#124321 merged
Apr 23, 2024
164 Pull requests opened by 105 people
-
Add basic sanity checks for graph ops to cache key
#124745 opened
Apr 23, 2024 -
Fix issue 112919
#124746 opened
Apr 23, 2024 -
Update descriptor fields to resolve fft precision issue
#124756 opened
Apr 23, 2024 -
Allow device tensors that use numpy for serialization to use weights_only unpickler
#124763 opened
Apr 23, 2024 -
[FSDP] Errored on wrapping `ModuleList`/`ModuleDict`
#124764 opened
Apr 23, 2024 -
[DCP] Adds storage metadata, and passes it during the save path
#124772 opened
Apr 23, 2024 -
Add Sanity Testing to Pytorch Profiler
#124773 opened
Apr 23, 2024 -
[sparse] Fix type-dispatch errors
#124777 opened
Apr 23, 2024 -
Attach stack traces to plain callables when using aot_export_joint_simple
#124792 opened
Apr 23, 2024 -
Meta kernel for _pack_padded_sequence
#124794 opened
Apr 23, 2024 -
[rfc]: vendor in open-telemetry
#124800 opened
Apr 23, 2024 -
[NT] Support NestedTensor in is_concrete_int
#124803 opened
Apr 23, 2024 -
Include support for the scatter gather cuda kernels to allow for comp…
#124809 opened
Apr 24, 2024 -
Upgrade nightly wheels to rocm6.1
#124811 opened
Apr 24, 2024 -
[DONT MERGE][dynamo] Turn on inlining of nn modules
#124815 opened
Apr 24, 2024 -
Prevent rendezvous shutdown on worker restarts
#124819 opened
Apr 24, 2024 -
[inductor] add uint8 SDPA pattern
#124832 opened
Apr 24, 2024 -
[privateuse1] _refs.masked_fill support privateuse1 when value.device.type is cpu
#124835 opened
Apr 24, 2024 -
update xla pin
#124839 opened
Apr 24, 2024 -
[RFC] Switch from black to ruff fmt
#124845 opened
Apr 24, 2024 -
[not for land] check CI with symint set_() fixes
#124867 opened
Apr 24, 2024 -
[dtensor] implement shard dim change with alltoall
#124872 opened
Apr 24, 2024 -
[pipelining] Add util and debug facilities
#124875 opened
Apr 24, 2024 -
[dtensor] delete the old unused mesh_alltoall
#124879 opened
Apr 24, 2024 -
[dynamo][eval_frame] Create a dynamic wrapper fn to avoid cache collisions
#124881 opened
Apr 24, 2024 -
Added FixedQParam per-tensor observer for weight tensors
#124883 opened
Apr 24, 2024 -
Add Efficient Attention support on ROCM
#124885 opened
Apr 24, 2024 -
[easy] remove some unecessary windows skips for mmap tests
#124891 opened
Apr 24, 2024 -
[export] handle weight sharing in FQN mapping + unflattening
#124892 opened
Apr 24, 2024 -
test_cuda.py test_grad_scaling_autocast_fused_optimizers migration to OptimizerInfo
#124893 opened
Apr 24, 2024 -
[MPS] Remove in place views (causes too many crashes)
#124895 opened
Apr 25, 2024 -
Implemented isin_Tensor_Tensor_out for MPS backend
#124896 opened
Apr 25, 2024 -
[optim]fix ut and sgd kernel
#124904 opened
Apr 25, 2024 -
[optim] add fused_adagrad support for CPU device
#124905 opened
Apr 25, 2024 -
[onnx.export] Avoid linear look up in env for exist_in_env
#124909 opened
Apr 25, 2024 -
[onnx.export] Cache SetGraphInputTypeReliable
#124912 opened
Apr 25, 2024 -
test
#124916 opened
Apr 25, 2024 -
Modify device check in capturable optimizer to support more devices
#124919 opened
Apr 25, 2024 -
remove empty partition
#124920 opened
Apr 25, 2024 -
Support aten operations with out tensor
#124926 opened
Apr 25, 2024 -
[Inductor max autotune] Make autotune_select_algorithm more robust
#124928 opened
Apr 25, 2024 -
[Inductor cutlass backend] Remove epilogue nodes from Kernel call
#124929 opened
Apr 25, 2024 -
[Inductor cutlass backend] Fix cutlass_utils.get_max_alignment() for strided layouts.
#124930 opened
Apr 25, 2024 -
[DCP] Provides default AsyncStager
#124939 opened
Apr 25, 2024 -
[DCP] Move async logic into filesystem for better encapsulation
#124944 opened
Apr 25, 2024 -
[export] disable_forced_specializations
#124949 opened
Apr 25, 2024 -
[pipelining] Add stage backward function
#124958 opened
Apr 25, 2024 -
Wrap the test func with try/except to always call destroy_process_group
#124961 opened
Apr 25, 2024 -
[dtensor] Make distribute_tensor support distributing DTensors
#124962 opened
Apr 25, 2024 -
[wip] add comm params in et
#124963 opened
Apr 25, 2024 -
PT2 Inductor ComboKernels
#124969 opened
Apr 25, 2024 -
Delete erroneous print
#124972 opened
Apr 25, 2024 -
[dynamo] Add ID_MATCH guards on inlined functions to force compilation on monkeypatching
#124975 opened
Apr 25, 2024 -
Fix bfloat16 serialization for ONNXProgram.save
#124977 opened
Apr 25, 2024 -
[dynamo] support torchbind object input
#124978 opened
Apr 25, 2024 -
WIP: [Inductor] log fusion failure due to loop orders
#124986 opened
Apr 26, 2024 -
[Distributed] [7/N] Fix clang-tidy warnings in torch/csrc/distributed/c10d
#124987 opened
Apr 26, 2024 -
add meta for segment_reduce_backward
#124988 opened
Apr 26, 2024 -
Print flexibility for tensor output
#124991 opened
Apr 26, 2024 -
Remove Inductor IRs for legacy functional collectives
#124992 opened
Apr 26, 2024 -
Enable UFMT on `test/test_datapipe.py`
#124994 opened
Apr 26, 2024 -
[BE] Remove JNI from libtorch builds
#124995 opened
Apr 26, 2024 -
[ROCm] Enable int_mm_error tests for rocm 6.0+
#124999 opened
Apr 26, 2024 -
[Torch] Add more mm kernel choices
#125000 opened
Apr 26, 2024 -
Convert `ForeachFuncInfo` to `dataclass`
#125001 opened
Apr 26, 2024 -
Fix process group initialize twice error in distributed launcher test
#125006 opened
Apr 26, 2024 -
Adding Compare in torch.utils.benchmark documentation
#125009 opened
Apr 26, 2024 -
[DataLoader] Select available CUDA or 3rd devices automatically to pin memory
#125016 opened
Apr 26, 2024 -
[AMD] [Draft] New inductor gemm configs
#125017 opened
Apr 26, 2024 -
[BE] Remove static files from libtorch builds
#125027 opened
Apr 26, 2024 -
[BE]: Update ruff to v0.4.2
#125031 opened
Apr 26, 2024 -
[WIP] Introduce bind_unbacked HOP
#125034 opened
Apr 26, 2024 -
[Profiler] Add TSC Clock Callback to CUPTI
#125036 opened
Apr 26, 2024 -
[Torch][Timer] Skip expired timer logging for empty expired timers
#125039 opened
Apr 26, 2024 -
[unwind] replace LONG_LONG_MAX by the portable LLONG_MAX
#125043 opened
Apr 26, 2024 -
Remove caffe2 image and video
#125045 opened
Apr 26, 2024 -
[WIP] Added a function to calculate the deterministic version of MaxPool3D …
#125048 opened
Apr 26, 2024 -
[TD] Enable td on cpu windows
#125049 opened
Apr 26, 2024 -
[Draft] Warning message when user calls unscriptable component
#125053 opened
Apr 26, 2024 -
[aoti] Add pt2 package save/load (python)
#125054 opened
Apr 26, 2024 -
[BE] Make macos build libtorch with wheel
#125060 opened
Apr 26, 2024 -
[dynamo, 3.12] xfail refleaking tests due to buggy getattr_static
#125062 opened
Apr 26, 2024 -
Skip ONNX optimization when it fails
#125063 opened
Apr 26, 2024 -
Force specialization of bool in scaled_dot_product_attention
#125067 opened
Apr 26, 2024 -
[ROCm] Implement forward AD for miopen_batch_norm
#125069 opened
Apr 26, 2024 -
Updating optims and combining torch functions
#125071 opened
Apr 26, 2024 -
Forcing specialization of bool from symbool
#125072 opened
Apr 26, 2024 -
[DTensor] allow numel 1 tensor operand to be implicitly replicate DTensor
#125073 opened
Apr 26, 2024 -
Adding state_dicts test to adam_test
#125076 opened
Apr 26, 2024 -
forward fix preferred blas backend and windows CI
#125080 opened
Apr 26, 2024 -
add uuid in cudaDeviceProperties
#125083 opened
Apr 26, 2024 -
[dnl] add NCCL/PT debug log for S413673
#125085 opened
Apr 27, 2024 -
[WIP] Implement matrix_exp Batching Rule
#125086 opened
Apr 27, 2024 -
[inductor][easy] add buffer layout to SchedulerNode.debug_str
#125088 opened
Apr 27, 2024 -
[inductor] add triton code to SchedulerNode.debug_str
#125089 opened
Apr 27, 2024 -
Remove caffe2 db
#125092 opened
Apr 27, 2024 -
[Test][Distributed] Make more tests multi-threaded.
#125095 opened
Apr 27, 2024 -
[pruning]add dropout to list of supported activation functions
#125101 opened
Apr 27, 2024 -
Enable clang-tidy coverage on torch/csrc/distributed/c10d/*
#125102 opened
Apr 27, 2024 -
[WIP] make torch.amp.autocast more generic
#125103 opened
Apr 27, 2024 -
Add extra cuda_to_hip_mappings.py
#125108 opened
Apr 27, 2024 -
Allow linalg.lstsq to use svd to compute the result for rank deficient matrices.
#125110 opened
Apr 28, 2024 -
Enable UFMT on test_indexing&test_view_ops
#125112 opened
Apr 28, 2024 -
Use BFloat16 in distributed quantization when supported by NCCL
#125113 opened
Apr 28, 2024 -
Move autocast op list to autocast_mode.h to make sure other backends can reuse it.
#125114 opened
Apr 28, 2024 -
Add propagate_real_tensors mode for unbacked
#125115 opened
Apr 28, 2024 -
Enable UFMT on `test_decomp.py`, `test_expanded_weights.py` and some files
#125117 opened
Apr 28, 2024 -
Refactor and Fix Some Prombles on Autocast
#125118 opened
Apr 28, 2024 -
[inductor] Check if n is the input tensor of conv_pointwise
#125119 opened
Apr 28, 2024 -
[Storage_ipc] Provides IPC extensions for 3rd devices.
#125122 opened
Apr 28, 2024 -
Updated test_graph_optims and test_graph_scaling_fused_optimizers to use new OptimizerInfo infrastructure
#125127 opened
Apr 28, 2024 -
Fix exception handling in torch._dynamo.utils.same and add corresponding test
#125132 opened
Apr 29, 2024 -
Fix bug in graph partitioner and update graph signature after partitioning.
#125133 opened
Apr 29, 2024 -
[PT2][Optimus] Read the patterns from the config instead of hard-code passes
#125136 opened
Apr 29, 2024 -
Add templated attention BLOCK_M & BLOCK_N default size for different head_dim
#125139 opened
Apr 29, 2024 -
Remove Caffe2 python
#125143 opened
Apr 29, 2024 -
Fix AttributeError when doing mock patch for FileTimerServerTest.test_expired_timers
#125144 opened
Apr 29, 2024 -
add avx512 specialization for vec_shuffle_down
#125147 opened
Apr 29, 2024 -
save the reciprocal of weights for welford_reduce
#125148 opened
Apr 29, 2024 -
[inductor][cpp] move some common cpp utils to cpp_utils.py
#125152 opened
Apr 29, 2024 -
Merge the pyi files into py files of optimizer
#125153 opened
Apr 29, 2024 -
Test benchmark suite with data dependent options on for dynamic shapes
#125156 opened
Apr 29, 2024 -
WIP save to cache
#125157 opened
Apr 29, 2024 -
[inductor] autotune benchmark support for cpu
#125159 opened
Apr 29, 2024 -
Renable running before
#125160 opened
Apr 29, 2024 -
[MPS] And naive int8 and int4 Linear
#125163 opened
Apr 29, 2024 -
temp
#125164 opened
Apr 29, 2024 -
ignore verify placeholders
#125165 opened
Apr 29, 2024 -
fix more invalid inputs, remove fuse
#125166 opened
Apr 29, 2024 -
Disable running before and update error message to be less verbose
#125167 opened
Apr 29, 2024 -
Enable running before
#125168 opened
Apr 29, 2024 -
Enable fuse
#125169 opened
Apr 29, 2024 -
[ATen][CUDA][AMP] Fix dtype mismatch in linalg_vector_norm
#125175 opened
Apr 29, 2024 -
TEST: add some random logging
#125176 opened
Apr 29, 2024 -
Add a space on APPEND to CUDA flags
#125178 opened
Apr 29, 2024 -
Refactored _remove_auto_functionalization_from_graph_helper
#125180 opened
Apr 29, 2024 -
Add `write_record_metadata` to PyTorchFileWriter
#125184 opened
Apr 29, 2024 -
[export] Don't create a new fake mode if dynamo tracing
#125185 opened
Apr 29, 2024 -
Use torch._check for safety assert in _reshape_view_helper
#125187 opened
Apr 29, 2024 -
Don't short circuit if shape is same
#125188 opened
Apr 29, 2024 -
[export] Fix for unflattening modules with duplicate tensors
#125192 opened
Apr 29, 2024 -
[ez][CI] Move test_modules and test_schema_check off CI_SERIAL_LIST
#125193 opened
Apr 29, 2024 -
Fix bug in get_update_constraint
#125194 opened
Apr 29, 2024 -
Fix mem size mismatch from split/chunk in const folding
#125199 opened
Apr 29, 2024 -
[compiled autograd] compile fwd in inference mode
#125201 opened
Apr 29, 2024 -
[DONT MERGE][FOR CI][dynamo] Turn on guard_nn_modules
#125202 opened
Apr 29, 2024 -
[dynamo] support inactive context managers across graph breaks
#125203 opened
Apr 30, 2024 -
FP8 rowwise scaling
#125204 opened
Apr 30, 2024 -
Fix PT2E Dynamic Quant regression
#125207 opened
Apr 30, 2024 -
[quant][pt2e] Fix conv-bn weight + bias per channel QAT
#125208 opened
Apr 30, 2024 -
Export `torch.jit.interface` from `torch.jit` package
#125209 opened
Apr 30, 2024 -
[Don't Merge] dump indoctor build command.
#125210 opened
Apr 30, 2024 -
Enable AOTI shim v2 build and add into libtorch
#125211 opened
Apr 30, 2024 -
Intel GPU: specify the tolerance for torchbench models
#125213 opened
Apr 30, 2024 -
fix loading optimizer options from archive
#125215 opened
Apr 30, 2024 -
Require nnz==0 in sparse meta tensors
#125221 opened
Apr 30, 2024 -
[autocast] using new autocast api with device name.
#125225 opened
Apr 30, 2024 -
fix: typo
#125226 opened
Apr 30, 2024 -
Fix logic to find sbgemm in BLAS library
#125227 opened
Apr 30, 2024 -
Fix random_mps_impl to accept non-contiguous tensors
#125231 opened
Apr 30, 2024 -
Change to fix cuda python on corp based cluster (#117789)
#125232 opened
Apr 30, 2024 -
[AOTI] Update C-shim codegen to handle rocm
#125233 opened
Apr 30, 2024 -
[ncclx] Rename NCCL-EXP to NCCLX
#125238 opened
Apr 30, 2024 -
[Inductor] Properly package target info for triton.compile
#125241 opened
Apr 30, 2024
116 Issues closed by 43 people
-
Cudnn 9 is out!
#119400 closed
Apr 30, 2024 -
Add option to `torch.load(mmap=True)` to do `MAP_SHARED` rather than `MAP_PRIVATE`
#124528 closed
Apr 30, 2024 -
DISABLED test_comprehensive_nn_functional_conv1d_cuda_float32 (__main__.TestInductorOpInfoCUDA)
#123874 closed
Apr 30, 2024 -
DISABLED test_super_resolution_cuda (__main__.TestModels)
#105332 closed
Apr 30, 2024 -
DISABLED test_torchvision_models_efficientnet_v2_m (__main__.TestVisionTracing)
#124152 closed
Apr 30, 2024 -
device_mesh / fsdp issue with _get_device_handle
#124327 closed
Apr 30, 2024 -
Unable to use `torch.compile()` with triton's `TensorWrapper` in custom triton kernel
#124601 closed
Apr 30, 2024 -
ERROR: No matching distribution found for torchvision==0.6.0+cu121
#124587 closed
Apr 30, 2024 -
DISABLED test_torchvision_models_alexnet (__main__.TestVisionTracing)
#123908 closed
Apr 30, 2024 -
problematic math backend for F.scaled_dot_product_attention in ROCm 6.0 when testing using vllm for generate
#119389 closed
Apr 30, 2024 -
DISABLED test_open_device_registration (__main__.TestCppExtensionOpenRgistration)
#100152 closed
Apr 29, 2024 -
How can i train my model with torchrun on multiple GPUs but without DDP?
#125012 closed
Apr 29, 2024 -
[typing] `nn.Parameter` return type identified as `Tensor` by `pyright`
#125105 closed
Apr 29, 2024 -
DISABLED test_max_autotune_remote_caching_dynamic_False (__main__.TestMaxAutotune)
#121166 closed
Apr 29, 2024 -
DISABLED test_expand_cuda (__main__.TestUnbackedSymintsCUDA)
#124074 closed
Apr 29, 2024 -
DISABLED test_binary_op_list_error_cases__foreach_add_cuda_bool (__main__.TestForeachCUDA)
#122900 closed
Apr 29, 2024 -
DISABLED test_max_autotune_remote_caching_dynamic_True (__main__.TestMaxAutotune)
#121194 closed
Apr 29, 2024 -
Dataloader codeowner
#124473 closed
Apr 29, 2024 -
torch.export.export doesn't capture input argument names from the module's forward(...) function.
#122842 closed
Apr 29, 2024 -
CI with >8G CUDA memory
#18856 closed
Apr 29, 2024 -
No continuous integration coverage for Python 2 CUDA
#21467 closed
Apr 29, 2024 -
Auto-applying labels if one-or-more Labels are applied?
#117051 closed
Apr 29, 2024 -
[ignore this] Testing
#125189 closed
Apr 29, 2024 -
Warning can not be disabled ?
#123626 closed
Apr 29, 2024 -
Bug when indexing 2D tensors using an MPS device
#125100 closed
Apr 29, 2024 -
[inductor] Get wrong results when supporting module buffer mutation
#124583 closed
Apr 29, 2024 -
[dynamo] "step unsupported" graph break will make dynamo can't completely trace code after break
#125138 closed
Apr 29, 2024 -
use bfloat16 on nvidia V100 GPU
#124996 closed
Apr 29, 2024 -
doc link failure of `torch.compile`
#125123 closed
Apr 29, 2024 -
DISABLED test_triton_kernel_extern_kernel_arg_abi_compatible_cuda (__main__.AOTInductorTestABICompatibleCuda)
#118544 closed
Apr 29, 2024 -
DISABLED test_torchvision_models_regnet_y_8gf (__main__.TestVisionTracing)
#123977 closed
Apr 29, 2024 -
DISABLED test_torchvision_models_regnet_y_16gf (__main__.TestVisionTracing)
#123976 closed
Apr 29, 2024 -
GuardOnDataDependent error with differentiable output that has data dependent size
#124766 closed
Apr 28, 2024 -
Can't fine-tune a MeloTTS model on my dataset in Google Colab
#125128 closed
Apr 28, 2024 -
`torch.autocast` produces confusing error message when passing `torch.device`
#124738 closed
Apr 28, 2024 -
Running speechbrain demo on aarch64 with pytorch 2.1.2 is much slower than pytorch 1.10.0
#123143 closed
Apr 28, 2024 -
[inductor][cpu]basic_gnn_gcn AMP static/dynamic shape default/cpp wrapper single thread performance regression
#123502 closed
Apr 28, 2024 -
TOR901 lint is too aggressive
#125050 closed
Apr 27, 2024 -
[numpy] Add torch.newdim/torch.newaxis
#65307 closed
Apr 27, 2024 -
Compiled model raises error "attn_bias is not correctly aligned" in pytorch 2.2
#121943 closed
Apr 27, 2024 -
Unexpected instruction types specified for 'sub' on TIMM seresnext26d_32x4d model
#118589 closed
Apr 27, 2024 -
UNSTABLE rocm / linux-focal-rocm6.0-py3.8 / test (default)
#119908 closed
Apr 27, 2024 -
[MPS] `torch.nextafter` incorrect handling of negative inputs
#124985 closed
Apr 27, 2024 -
[INDUCTOR] [CPU] [GPT-FAST-MOE] large perf regression with coordinate_descent_tuning disabled
#124697 closed
Apr 27, 2024 -
DISABLED test_comprehensive_fft_fft_cuda_float64 (__main__.TestInductorOpInfoCUDA)
#122715 closed
Apr 26, 2024 -
[ONNX] Memory leak
#86518 closed
Apr 26, 2024 -
torch dynamo's ORT backend uses "ort" dispatch key instead of "maia"
#124966 closed
Apr 26, 2024 -
DISABLED test_all_to_all_single_inductor (__main__.TestFunctionalAutograd)
#123933 closed
Apr 26, 2024 -
DISABLED test_load_tensor_cuda (__main__.TestContentStoreCUDA)
#123849 closed
Apr 26, 2024 -
DISABLED test_buffer_mutation_3_non_abi_compatible_cuda (__main__.AOTInductorTestNonABICompatibleCuda)
#123321 closed
Apr 26, 2024 -
DISABLED test_equivalent_backed_unbacked_cuda (__main__.TestUnbackedSymintsCUDA)
#123947 closed
Apr 26, 2024 -
"torch._dynamo.exc.Unsupported: torch.* op returned non-Tensor bool call_method is_complex" error
#122692 closed
Apr 26, 2024 -
[v.2.3.0] Release Tracker
#121760 closed
Apr 26, 2024 -
Add a GitHub actions workflow for Macos
#63466 closed
Apr 26, 2024 -
a weird bug in torch.compile
#124817 closed
Apr 26, 2024 -
h
#124993 closed
Apr 26, 2024 -
DISABLED test_torchvision_models_efficientnet_b1 (__main__.TestVisionTracing)
#123889 closed
Apr 26, 2024 -
DISABLED test_broadcast_tensors_cuda (__main__.TestUnbackedSymintsCUDA)
#123862 closed
Apr 26, 2024 -
DISABLED test_autotuning_cuda (__main__.TestUnbackedSymintsCUDA)
#123729 closed
Apr 26, 2024 -
DISABLED test_conv_unary_fusion_nnc (__main__.TestMkldnnFusion)
#123905 closed
Apr 26, 2024 -
DISABLED test_output_misaligned_non_abi_compatible_cuda (__main__.AOTInductorTestNonABICompatibleCuda)
#123818 closed
Apr 26, 2024 -
DISABLED test_basic_cuda (__main__.TestContentStoreCUDA)
#100209 closed
Apr 26, 2024 -
DISABLED test_torchvision_models_maxvit_t (__main__.TestVisionTracing)
#123918 closed
Apr 26, 2024 -
DISABLED test_comprehensive_ones_cuda_int64 (__main__.TestInductorOpInfoCUDA)
#123837 closed
Apr 26, 2024 -
DISABLED test_torchvision_models_efficientnet_b3 (__main__.TestVisionTracing)
#123891 closed
Apr 26, 2024 -
DISABLED test_torchvision_models_efficientnet_b7 (__main__.TestVisionTracing)
#123890 closed
Apr 26, 2024 -
DISABLED test_torchvision_models_densenet169 (__main__.TestVisionTracing)
#123907 closed
Apr 26, 2024 -
DISABLED test_batched_mm_bfloat16_bs_10_cuda_bfloat16 (__main__.TestDecompCUDA)
#123728 closed
Apr 26, 2024 -
Support benchmark fusion for TemplateKernel
#108716 closed
Apr 26, 2024 -
DISABLED test_comprehensive_amax_cuda_float16 (__main__.TestInductorOpInfoCUDA)
#124640 closed
Apr 26, 2024 -
[onnx] support more combinations of args/kwargs as model inputs for pytorch-onnx converter
#81478 closed
Apr 26, 2024 -
VecISA.__bool__ is very expensive (nearly a second) on startup
#100378 closed
Apr 25, 2024 -
Torch nightly wheels no longer include `torchgen` YAML files
#124941 closed
Apr 25, 2024 -
Export serializes empty list as list of bools
#123480 closed
Apr 25, 2024 -
[Dynamo] Unsupported: missing: DELETE_SUBSCR
#123317 closed
Apr 25, 2024 -
XML test-reports get overwritten in case of retry
#123882 closed
Apr 25, 2024 -
How to catch NCCL collective timeout in Python
#124887 closed
Apr 25, 2024 -
Maybe consider vendoring opentelemetry by hand, rather than submodule
#124612 closed
Apr 25, 2024 -
UNSTABLE rocm
#124951 closed
Apr 25, 2024 -
Dataloader crashes with FileNotFoundError randomly when num_workers>0 on Ubuntu 22.04
#124903 closed
Apr 25, 2024 -
torch.compile : RuntimeError: "foreach_tensor_copy" not implemented for 'Int'
#124170 closed
Apr 25, 2024 -
pytorch Windows MKL cmake file don't support static link mkl.
#124869 closed
Apr 25, 2024 -
Backwards with cat with data-dependent sizes doesn't work
#124652 closed
Apr 25, 2024 -
Support CUDA 12.4
#104417 closed
Apr 25, 2024 -
`Enum` used as a key of the input raises guards error
#111603 closed
Apr 25, 2024 -
Find a bug from beta-released "scaled_dot_product_attention"
#124464 closed
Apr 25, 2024 -
Interpolate nearest
#121390 closed
Apr 25, 2024 -
[dynamo] Tracing triton kernel unexpectedly
#122768 closed
Apr 25, 2024 -
Substitutions result in unbacked SymInts showing up before their definition sites
#123854 closed
Apr 25, 2024 -
SDPA + torch.compile: (*bias): last dimension must be contiguous
#124289 closed
Apr 24, 2024 -
Correctly handle `F.interpolate` upsample with amp
#121072 closed
Apr 24, 2024 -
[ONNX] Discuss improvements to Diagnostic public API
#103713 closed
Apr 24, 2024 -
Not loading optimizer state separately from checkpoint causes errors with FQNs
#124546 closed
Apr 24, 2024 -
Switch `libopenblas` and `libopenblas-dev` to `libopenblas64` and `libopenblas64-dev`
#123534 closed
Apr 24, 2024 -
Checkpointed function does not preserve `requires_grad` if output is a dataclass.
#124725 closed
Apr 24, 2024 -
Release 2.3 manual validations
#123736 closed
Apr 24, 2024 -
Validate cheerry-picks for release 2.3
#123734 closed
Apr 24, 2024 -
`aminmax` will trigger INTERNAL ASSERT if input is empty on cuda
#85439 closed
Apr 24, 2024 -
`test_triton_scaled_dot_product_attention_block_size_16_cuda_bfloat16` is broken on A100
#124333 closed
Apr 24, 2024 -
[PREEMPTIVE] Migration for ARC runners - Possible disrruption of jobs or increased queue times
#124831 closed
Apr 24, 2024 -
[Profiler] Maybe append a kernel to an unrelated event.
#124388 closed
Apr 24, 2024 -
schedular
#124367 closed
Apr 24, 2024 -
Clean way to distinguish python subclass NT vs. C++ NT
#110543 closed
Apr 23, 2024 -
Verbose log: [__aot_joint_graph] could not reconstruct view by re-applying a ViewMeta sequence.
#124499 closed
Apr 23, 2024 -
In `_scaled_mm`, `scale_result` not changing result tensor at all
#119135 closed
Apr 23, 2024
138 Issues opened by 87 people
-
DISABLED test_bmm_multithreaded (__main__.TestTorch)
#125240 opened
Apr 30, 2024 -
Improved strategy for dealing with deterministically flaky tests which are order sensitive
#125239 opened
Apr 30, 2024 -
torch.no_grad() is not working for dynamo inductor backend
#125236 opened
Apr 30, 2024 -
[Inductor] [Distributed] DDP torch.compile model hangs on exit (python 3.8/3.9)
#125235 opened
Apr 30, 2024 -
torch.Library can easily cause segfault on loading/unloading
#125234 opened
Apr 30, 2024 -
ROCm: `fatal error: aotriton/flash.h: No such file or directory` when building with `USE_ROCM=1`
#125230 opened
Apr 30, 2024 -
DISABLED test_variable_traverse (__main__.TestAutogradWithCompiledAutograd)
#125229 opened
Apr 30, 2024 -
DISABLED test_issue106555 (__main__.TestCompiledAutograd)
#125228 opened
Apr 30, 2024 -
Strange behavior of randint using device=cuda
#125224 opened
Apr 30, 2024 -
torch.uniform_() is single-threaded on CPU
#125223 opened
Apr 30, 2024 -
DISABLED test_var_mean_differentiable (__main__.TestAutogradWithCompiledAutograd)
#125220 opened
Apr 30, 2024 -
DISABLED test_inplace_grad_update (__main__.TestCompiledAutograd)
#125219 opened
Apr 30, 2024 -
DISABLED test_perfect_match_on_sequence_and_bool_attributes (__main__.TestFxToOnnx)
#125218 opened
Apr 30, 2024 -
MaxPool2D memory leakage on device MPS
#125217 opened
Apr 30, 2024 -
torch.inference_mode documentation not availble
#125216 opened
Apr 30, 2024 -
[NT] Implementing Multi-Head Attention with NestedTensors
#125214 opened
Apr 30, 2024 -
DISABLED test_type_conversions (__main__.TestAutogradWithCompiledAutograd)
#125206 opened
Apr 30, 2024 -
DISABLED test_unused_output (__main__.TestAutogradWithCompiledAutograd)
#125205 opened
Apr 30, 2024 -
Pytorch running on macosx-14.xlarge reports MPS is available when it is not.
#125197 opened
Apr 29, 2024 -
DISABLED test_too_many_grads (__main__.TestAutogradWithCompiledAutograd)
#125195 opened
Apr 29, 2024 -
multithreaded autograd backward doesn't respect autocast dtype context manager
#125186 opened
Apr 29, 2024 -
torch/_refs/__init__.py is not autoreloadable in bento / jupyter notebook
#125183 opened
Apr 29, 2024 -
QR Decomposition for Sparse Matrix
#125182 opened
Apr 29, 2024 -
Mitigate pypi issue with space (short term)
#125179 opened
Apr 29, 2024 -
Can't load on rank 0 only with `set_optimizer_state_dict`
#125177 opened
Apr 29, 2024 -
[CUDA][AMP] Size-1 (scalar) norms are broken on CUDA + AMP following #122143
#125174 opened
Apr 29, 2024 -
Flight Recorder Sequence IDs are insufficient
#125173 opened
Apr 29, 2024 -
datettime.now() is not supported by Dynamo
#125171 opened
Apr 29, 2024 -
Calling `get_model_state_dict/set_model_state_dict` requires forward pass for `_lazy_init`
#125170 opened
Apr 29, 2024 -
`scale` parsed as `float` in ONNX `scaled_dot_product_attention` implementation
#125158 opened
Apr 29, 2024 -
DISABLED test_comprehensive_special_bessel_y1_cuda_float32 (__main__.TestInductorOpInfoCUDA)
#125151 opened
Apr 29, 2024 -
DISABLED test_slice_expanded_v (__main__.TestAutogradWithCompiledAutograd)
#125149 opened
Apr 29, 2024 -
Race condition in FileTimerServerTest.test_expired_timers
#125146 opened
Apr 29, 2024 -
The "step unsupported" graph break will make dynamo can't completely trace code after break
#125141 opened
Apr 29, 2024 -
Tensor.abs() gives incorrect results on Complex64 when using MPS
#125135 opened
Apr 29, 2024 -
binaries/dump_operator_names.cc missing iostream include
#125134 opened
Apr 29, 2024 -
DISABLED test_sharded_grad (__main__.TestAutogradWithCompiledAutograd)
#125130 opened
Apr 29, 2024 -
FP6 dtype!
#125129 opened
Apr 28, 2024 -
torch.is_signed on new uint dtypes raises Unknown ScalarType
#125124 opened
Apr 28, 2024 -
Does torch.nn.Linear check the weights shape before assignment?
#125116 opened
Apr 28, 2024 -
2.3.0 on Windows missing dependency?
#125109 opened
Apr 27, 2024 -
Further explanation for `batch_isend_irecv`
#125099 opened
Apr 27, 2024 -
RuntimeError: MPS: Unsupported Border padding mode
#125098 opened
Apr 27, 2024 -
PyTorch Distributed Load Updates or Returns `state_dict`
#125096 opened
Apr 27, 2024 -
Broken Docker Image on dockerhub
#125094 opened
Apr 27, 2024 -
Triton installation not found.
#125093 opened
Apr 27, 2024 -
[Distributed] P2P Operations on NCCL do not respect tag
#125079 opened
Apr 26, 2024 -
`torch.compile` fails with `jacfwd` when multiplying/dividing float and tensor
#125078 opened
Apr 26, 2024 -
[Inductor] Generate triton block pointers for discontiguous strided tensors
#125077 opened
Apr 26, 2024 -
Inductor can not fuse cat with a pointwise
#125075 opened
Apr 26, 2024 -
2.2.0+ regresses SDPA performance on Windows
#125070 opened
Apr 26, 2024 -
ValueError: weight_norm of 'weight' not found in ParametrizedConvTranspose1d
#125064 opened
Apr 26, 2024 -
404 on torch.inference_mode doc page
#125059 opened
Apr 26, 2024 -
DISABLED test_sdpa_backwards_cuda_bfloat16 (__main__.TestNestedTensorSubclassCUDA)
#125058 opened
Apr 26, 2024 -
DISABLED test_foreach_matches_forloop_RAdam_cpu_float64 (__main__.TestOptimRenewedCPU)
#125057 opened
Apr 26, 2024 -
DISABLED test_select_expanded_v (__main__.TestAutogradWithCompiledAutograd)
#125056 opened
Apr 26, 2024 -
DISABLED test_dynamic_shapes (__main__.TestCompiledAutograd)
#125055 opened
Apr 26, 2024 -
MPS backend thinks that small floats are less than zero
#125051 opened
Apr 26, 2024 -
Support auto_functionalized for None returns
#125044 opened
Apr 26, 2024 -
[Feature Request] Support `dtype` arg in `torch._foreach_norm`
#125040 opened
Apr 26, 2024 -
DISABLED test_binary_op_list_error_cases__foreach_clamp_max_cuda_float64 (__main__.TestForeachCUDA)
#125035 opened
Apr 26, 2024 -
Tensorboard's SummaryWriter.add_graph() doesn't work with packedSequence
#125033 opened
Apr 26, 2024 -
Allow operators with no returns to be "out=" operations
#125030 opened
Apr 26, 2024 -
OpInfo testing is inadequate for `nextafter`
#125028 opened
Apr 26, 2024 -
CUDA memory summary in FSDP all_gather causes excessive noise.
#125025 opened
Apr 26, 2024 -
DISABLED test_custom_fn_output_metadata (__main__.TestCompiledAutograd)
#125024 opened
Apr 26, 2024 -
DISABLED test_binary_op_float_inf_nan__foreach_clamp_min_cuda_float32 (__main__.TestForeachCUDA)
#125022 opened
Apr 26, 2024 -
DISABLED test_allocation_id_uniqueness (__main__.TestTorchTidyProfiler)
#125021 opened
Apr 26, 2024 -
DISABLED test_unbacked_cat_backwards_cuda (__main__.TestInductorDynamicCUDA)
#125019 opened
Apr 26, 2024 -
libtorch C++ windows: The specified module could not be found. mkl_vml_def.1.dll
#125013 opened
Apr 26, 2024 -
Tensor can not be accessed!
#125010 opened
Apr 26, 2024 -
Support exporting the compute graph JiT compiled by inductor?
#125007 opened
Apr 26, 2024 -
RuntimeError: Unsupported qscheme: per_channel_affine. When using quantise_fx(
#125004 opened
Apr 26, 2024 -
bool as scalar in jit ir API not working as expected.
#125003 opened
Apr 26, 2024 -
[DTensor] Sharding strategy not implemented error should be thrown earlier
#124990 opened
Apr 26, 2024 -
[DTensor] Keep track of data-dependent ops and skip _propagate_tensor_meta to go thru fake tensor
#124989 opened
Apr 26, 2024 -
[Releng] Triton version for minor releases starting with 2.4
#124974 opened
Apr 25, 2024 -
Export llama v3 to ONNX
#124973 opened
Apr 25, 2024 -
Give the possibly to get back normal `Tensor`s as `MaskedTensor` gradients
#124964 opened
Apr 25, 2024 -
`ascii` error during recompilation after cache limit is reached (likely due to `einsum`)
#124960 opened
Apr 25, 2024 -
torch.onnx.export generates an incorrect bias shape for certain conv transpose models
#124956 opened
Apr 25, 2024 -
PRs with unbalanced quotes can not be merged
#124953 opened
Apr 25, 2024 -
[distributed] First NCCL barrier does not respect timeout
#124950 opened
Apr 25, 2024 -
torch.compile fails on hugging face Mistral7b
#124946 opened
Apr 25, 2024 -
Conflict between bias=False and why_not_sparsity_fast_path in Transformer Module
#124937 opened
Apr 25, 2024 -
Tensor's storage changes computation outcome for CPU tensors.
#124934 opened
Apr 25, 2024 -
[RFC] Support reinplaceble ops for custom ops in Inductor
#124933 opened
Apr 25, 2024 -
Missing description of Transformer argument "memory_mask" shape in 3D (including the batch dimension) case
#124931 opened
Apr 25, 2024 -
Time taken to data loading increased in newer builds (ARM)
#124922 opened
Apr 25, 2024 -
Custom Operator Design for torch.compile: Must Output Tensors Always Be Returned?
#124918 opened
Apr 25, 2024 -
Exporting torch slice_scatter to onnx Identity
#124915 opened
Apr 25, 2024 -
[inductor][cpu]shufflenet_v2_x1_0 QAT performance regression in 2024-04-20 nightly release
#124913 opened
Apr 25, 2024 -
DataLoader's pin_memory is default to CUDA if parameter pin_memory_device is not set
#124908 opened
Apr 25, 2024 -
2.3.0 not backward compatible with torchdata
#124907 opened
Apr 25, 2024 -
Many cases in distributed/elastic/multiprocessing/redirects_test.py fails when use pytest
#124906 opened
Apr 25, 2024 -
Add support for IPC features for PrivateUse devices
#124902 opened
Apr 25, 2024 -
`RuntimeError: invalid dtype for bias` when use compile + autocast
#124901 opened
Apr 25, 2024 -
[dynamo] Crash when context manager object crosses a graph break
#124900 opened
Apr 25, 2024 -
Certain .pyi files are not encoded as UTF-8 in Windows
#124897 opened
Apr 25, 2024 -
Windows build step did not fail despite error
#124886 opened
Apr 24, 2024 -
ONNX dynamic sized model export with torch.onnx.dynamo_export fails when torch.nn.functional.interpolate is used
#124884 opened
Apr 24, 2024 -
Nested wrapper subclasses with torch.compile is broken
#124878 opened
Apr 24, 2024 -
SDPA memory efficient kernel returns NaNs when the query and key are different lengths
#124877 opened
Apr 24, 2024 -
tensor.dtype.to_complex() crashes kernel after ~100 calls in ipython kernel
#124868 opened
Apr 24, 2024 -
support side effects in HOPs?
#124866 opened
Apr 24, 2024 -
Deprecate unsupported types in operator registration
#124863 opened
Apr 24, 2024 -
torch._dynamo.assume_constant_result does not work outside nn.Module
#124858 opened
Apr 24, 2024 -
Use lintrunner-adapters for our adapters when possible
#124857 opened
Apr 24, 2024 -
ShapeEnv canonicalization is still over-aggressive
#124855 opened
Apr 24, 2024 -
[inductor] unexpected cuda:0 device usage when compiling and running a model on cuda:1
#124854 opened
Apr 24, 2024 -
aten::nonzero calls taking a huge amount of time when using MPS backend vs CPU
#124850 opened
Apr 24, 2024 -
[inductor][cpu]RuntimeError: no channels last format strides exist in 1 dimensions in 2024-04-21 nightly release
#124837 opened
Apr 24, 2024 -
nn.functional.ELU output differs on MPS vs CPU if input is noncontiguous
#124834 opened
Apr 24, 2024 -
MPS RNG state fails to progress immediately after fork_rng
#124833 opened
Apr 24, 2024 -
Add privateuse1 check on capturable optimizer
#124830 opened
Apr 24, 2024 -
CI is downloading model definitions and model weights from external sources
#124825 opened
Apr 24, 2024 -
torch.compile error: Unsupported reduction type from torch.float32 to torch.int64
#124821 opened
Apr 24, 2024 -
[RFC] Mix and Match CUDA Allocators using Private Pools
#124807 opened
Apr 24, 2024 -
Dynamo Export Support for Qwen/Qwen-7B-Chat: Mutating module attribute _ntk_alpha_cached_list during export
#124796 opened
Apr 23, 2024 -
Dynamo Export Support for Google/Gemma-2B: Mutating module attribute inv_freq during export
#124793 opened
Apr 23, 2024 -
Caffe2 usage of cuDNN RNNv6 API blocks upgrade to cuDNN v9+
#124790 opened
Apr 23, 2024 -
torch.nn.checkpoint.checkpoint ignores default device in backward() call
#124788 opened
Apr 23, 2024 -
Scatter_add limitation when accumulate beyond 2^24 under float32 precision
#124783 opened
Apr 23, 2024 -
Dense-sparse broadcasted multiplication fails in the backward pass
#124778 opened
Apr 23, 2024 -
Release artifacts for rc releases
#124759 opened
Apr 23, 2024 -
DISABLED test_comprehensive_randn_like_cuda_float32 (__main__.TestInductorOpInfoCUDA)
#124758 opened
Apr 23, 2024 -
Disable PYTORCH_TEST_WITH_DYNAMO=1 tests over the AOTAutograd tests
#124750 opened
Apr 23, 2024 -
Improve Dynamo (and other) flaky tests
#124749 opened
Apr 23, 2024 -
TestAOTAutograd.test_mem_leak_from_save_for_bw fails locally but not on CI when run with dynamo
#124747 opened
Apr 23, 2024
367 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
[dynamo] Automatically convert loop bodies to function calls
#113538 commented on
Apr 26, 2024 • 30 new comments -
[Meta Tensor] fix meta inplace set storage
#123880 commented on
Apr 29, 2024 • 25 new comments -
[vision hash update] update the pinned vision hash
#123227 commented on
Apr 30, 2024 • 24 new comments -
[NT] Make NestedTensor register as having symbolic sizes/strides
#124687 commented on
Apr 27, 2024 • 22 new comments -
[executorch hash update] update the pinned executorch hash
#123043 commented on
Apr 30, 2024 • 21 new comments -
[traced-graph][sparse] propagate sparsity metadata into traced graph
#117907 commented on
Apr 30, 2024 • 20 new comments -
Setup initial testing harness and cache key generation for AOTAutograd Cache
#124642 commented on
Apr 30, 2024 • 19 new comments -
UserWarning: Plan failed with a cudnnException
#121834 commented on
Apr 30, 2024 • 19 new comments -
ARC dynamic rollout
#124721 commented on
Apr 30, 2024 • 18 new comments -
TorchInductor CPU Performance Dashboard
#93531 commented on
Apr 30, 2024 • 16 new comments -
Add CUDA 12.4 workflows
#121684 commented on
Apr 30, 2024 • 16 new comments -
Refresh OpOverloadPacket if a new OpOverload gets added
#124654 commented on
Apr 26, 2024 • 16 new comments -
Add helper function to decompose and inline nn module
#123683 commented on
Apr 29, 2024 • 15 new comments -
fix Invalid call to aoti_torch_tensor_copy_ #123039
#124037 commented on
Apr 30, 2024 • 14 new comments -
Add a cache mechanism to accelerate torch.compile-for-eager
#116368 commented on
Apr 29, 2024 • 12 new comments -
optim.apply_optimizer_in_backward does not account for gradient accumulation
#124523 commented on
Apr 30, 2024 • 11 new comments -
[RFC] Per-Parameter-Sharding FSDP
#114299 commented on
Apr 29, 2024 • 11 new comments -
General MPS op coverage tracking issue
#77764 commented on
Apr 30, 2024 • 10 new comments -
Fakify script object inputs and attributes for non-strict export
#124239 commented on
Apr 30, 2024 • 10 new comments -
[dynamo] Refactor into torch/_inductor/runtime/compile_tasks.py
#124681 commented on
Apr 30, 2024 • 9 new comments -
[inductor] Remove usage of device_interface from _inductor.runtime
#124592 commented on
Apr 30, 2024 • 8 new comments -
[custom_op] use new python custom ops API on prims ops
#124665 commented on
Apr 30, 2024 • 8 new comments -
Verify types in custom op schemas
#124520 commented on
Apr 26, 2024 • 8 new comments -
[DeviceMesh] Fix hash and eq not match
#123572 commented on
Apr 27, 2024 • 7 new comments -
ARC runners timeout during Docker updates
#124727 commented on
Apr 29, 2024 • 7 new comments -
set_default_device/torch.device has performance impact for non-factory functions
#92701 commented on
Apr 24, 2024 • 7 new comments -
[DCP] Introduce async staging extension points
#122965 commented on
Apr 25, 2024 • 7 new comments -
[ROCm] TunableOp improvements
#124362 commented on
Apr 30, 2024 • 7 new comments -
Fixes two build problems on ROCM 6.1 + Ubuntu 22.04
#118216 commented on
Apr 23, 2024 • 6 new comments -
inductor: Add Conv3d support
#124361 commented on
Apr 29, 2024 • 6 new comments -
Allow tensor subclasses and add `torch.serialization.mark_safe_globals` that allows users to allowlist classes for `weights_only` load
#124331 commented on
Apr 30, 2024 • 6 new comments -
Fix typo under torch/_inductor directory
#119658 commented on
Apr 30, 2024 • 6 new comments -
[RFC] Autoload Device Extension
#122468 commented on
Apr 26, 2024 • 6 new comments -
[Memory Snapshot] Add recordAnnotations to capture record_function annotations
#124179 commented on
Apr 26, 2024 • 6 new comments -
Add a variable for some testcases.
#124708 commented on
Apr 29, 2024 • 6 new comments -
[DCP] Adds strict option to DefaultPlanner
#123869 commented on
Apr 26, 2024 • 6 new comments -
[c10d] abort a communicator at most once
#124436 commented on
Apr 29, 2024 • 5 new comments -
[guards][cpp-guards] Optimize NN module getattr guards
#124522 commented on
Apr 30, 2024 • 5 new comments -
Better core binding in torch.backends.xeon.run_cpu when launched from torchrun with --nproc-per-node
#123711 commented on
Apr 30, 2024 • 4 new comments -
Investigate torch.compile Windows support.
#122094 commented on
Apr 30, 2024 • 4 new comments -
Add LR as tensor tests
#123750 commented on
Apr 26, 2024 • 4 new comments -
Initial implementation of Inductor FX Graph Remote Cache
#124669 commented on
Apr 23, 2024 • 4 new comments -
Pytorch Conda nightly build failures due to timeout
#124667 commented on
Apr 29, 2024 • 4 new comments -
torch.Tensor.remainder raises a floating point exception when divisor is -1
#124644 commented on
Apr 29, 2024 • 4 new comments -
RuntimeError: derivative for aten::_scaled_dot_product_flash_attention_backward is not implemented
#116350 commented on
Apr 29, 2024 • 4 new comments -
Enable dynamo-traced optimizer peak memory tests
#124543 commented on
Apr 25, 2024 • 4 new comments -
[dynamo] Support numpy.dtype
#124481 commented on
Apr 24, 2024 • 4 new comments -
Fix the specific setting in documentation to match with elsewhere
#124463 commented on
Apr 27, 2024 • 4 new comments -
Fix for addcdiv contiguous problem
#124442 commented on
Apr 25, 2024 • 4 new comments -
[Quant][PT2E] enable qlinear post op fusion for dynamic quant & qat
#122667 commented on
Apr 30, 2024 • 3 new comments -
torch.compiled model output gets overwritten despite tensor.detach()
#104435 commented on
Apr 24, 2024 • 3 new comments -
Remove caffe2/
#122527 commented on
Apr 30, 2024 • 3 new comments -
[PT2D] Make the speedup benchmark works with DDP + CompiledAutograd
#121315 commented on
Apr 24, 2024 • 3 new comments -
[ONNX] STFT ExportProgram error
#113504 commented on
Apr 26, 2024 • 3 new comments -
DISABLED test_split_with_sizes_aot_autograd_cleans_up_traceback_meta (__main__.AotAutogradFallbackTests)
#122767 commented on
Apr 26, 2024 • 3 new comments -
DISABLED test_split_with_sizes_aot_autograd_cleans_up_traceback_meta_dynamic_shapes (__main__.DynamicShapesAotAutogradFallbackTests)
#122766 commented on
Apr 26, 2024 • 3 new comments -
Implement Copy-on-write (COW) tensors
#109833 commented on
Apr 26, 2024 • 3 new comments -
aten::_linalg_solve_ex.result' is not currently implemented for the MPS
#98222 commented on
Apr 27, 2024 • 3 new comments -
ARM libtorch links to libomp but libomp is no longer bundled
#124732 commented on
Apr 29, 2024 • 3 new comments -
DISABLED test_comprehensive_special_bessel_y1_cuda_int64 (__main__.TestInductorOpInfoCUDA)
#123919 commented on
Apr 30, 2024 • 3 new comments -
DISABLED test_free_activation_memory (__main__.TestCompiledAutograd)
#123949 commented on
Apr 30, 2024 • 3 new comments -
s390x: remove workaround for sleef issue
#124730 commented on
Apr 25, 2024 • 3 new comments -
Fast standalone symbolize for unwinding
#123966 commented on
Apr 25, 2024 • 3 new comments -
[ROCm][CI] upgrade CI to ROCm 6.1
#124300 commented on
Apr 29, 2024 • 3 new comments -
[draft] cuda 124 arm wheel test
#124112 commented on
Apr 26, 2024 • 3 new comments -
[DO NOT MERGE] Test new ROCm CI nodes
#124424 commented on
Apr 30, 2024 • 3 new comments -
[dynamo] Unexpected SymBool appearing in "is_causal" inside scaled_dot_product_attention()
#124707 commented on
Apr 24, 2024 • 3 new comments -
[torch.compile][FlopCounter] AssertionError: Global is not OptimizedModule._orig_mod
#124196 commented on
Apr 23, 2024 • 3 new comments -
[xla hash update] update the pinned xla hash
#124599 commented on
Apr 29, 2024 • 3 new comments -
DISABLED test_mark_non_differentiable (__main__.TestAutogradWithCompiledAutograd)
#124470 commented on
Apr 30, 2024 • 2 new comments -
Add aten._unsafe_masked_index
#116491 commented on
Apr 29, 2024 • 2 new comments -
DISABLED test_buffer_mutation_3_abi_compatible_cuda (__main__.AOTInductorTestABICompatibleCuda)
#123251 commented on
Apr 28, 2024 • 2 new comments -
torch.load with weights_only=True to support pickle protocol 3/4/5
#118166 commented on
Apr 25, 2024 • 2 new comments -
Revisit security implications of #31875
#111806 commented on
Apr 27, 2024 • 2 new comments -
[Environment Variable][1/N] Use thread-safe env variable API in c10
#119449 commented on
Apr 24, 2024 • 2 new comments -
sdp::SDPBackend::flash_attention support PrivateUse1
#124368 commented on
Apr 30, 2024 • 2 new comments -
DISABLED test_isolated_node (__main__.TestAutogradWithCompiledAutograd)
#124460 commented on
Apr 26, 2024 • 2 new comments -
[inductor] switch assume_aligned_inputs to False
#124336 commented on
Apr 29, 2024 • 2 new comments -
Torch compile does not work on python 3.12
#120233 commented on
Apr 26, 2024 • 2 new comments -
Investigate Strictness of torch.compile `is_big_gpu`
#109489 commented on
Apr 26, 2024 • 2 new comments -
Should be able to query a schema for HOPs
#119592 commented on
Apr 26, 2024 • 2 new comments -
CUBLAS_STATUS_EXECUTION_FAILED when calling cublasGemmEx
#124262 commented on
Apr 26, 2024 • 2 new comments -
Optionally use hipblaslt
#120551 commented on
Apr 26, 2024 • 2 new comments -
[Profiler] NCCL collectives text garbled and times reported in ns
#124102 commented on
Apr 24, 2024 • 2 new comments -
Ensure only builtins functions are wrapped in new frame for torch.compile
#124720 commented on
Apr 30, 2024 • 2 new comments -
Avoid using thrust:: directly, use THRUST_NS_QUALIFIER:: instead
#72582 commented on
Apr 30, 2024 • 2 new comments -
RX 6800 GPU reset when using ROCm in Stable Diffusion with Torch backend (not sure if relevant)
#120775 commented on
Apr 30, 2024 • 2 new comments -
GroupNorm & InstanceNorm does not handle channels_last correctly
#111824 commented on
Apr 29, 2024 • 2 new comments -
Update CUDA out of memory message with private pool info
#124673 commented on
Apr 24, 2024 • 2 new comments -
Fails to compile with nvidia-cuda-toolkit-12.4.0
#122169 commented on
Apr 30, 2024 • 2 new comments -
DISABLED test_multi_backward (__main__.TestAutogradWithCompiledAutograd)
#124491 commented on
Apr 30, 2024 • 2 new comments -
Unbacked SymInts: Should backwards graph with unbacked SymInts be recompiled with hints
#124686 commented on
Apr 23, 2024 • 2 new comments -
ROCm & Windows Support
#106608 commented on
Apr 29, 2024 • 2 new comments -
torch.compile does not work since 2.2.1 on MacOS for some models
#124497 commented on
Apr 23, 2024 • 2 new comments -
TypeError: unhashable type: non-singleton SymInt in AOTAutograd merge_view_inputs
#114366 commented on
Apr 29, 2024 • 2 new comments -
Correct error message for aten::_local_scalar_dense on meta tensor
#124554 commented on
Apr 24, 2024 • 2 new comments -
benchmark.Compare raises: TypeError: object of type 'NoneType' has no len()
#63971 commented on
Apr 29, 2024 • 2 new comments -
[RFC] PyTorch DistributedTensor
#88838 commented on
Apr 29, 2024 • 2 new comments -
[triton hash update] update the pinned triton hash
#115529 commented on
Apr 29, 2024 • 2 new comments -
Fix common_utils's retry decorator, add run_tests call to test_hub
#116067 commented on
Apr 30, 2024 • 2 new comments -
DataLoader num_workers > 0 causes CPU memory from parent process to be replicated in all worker processes
#13246 commented on
Apr 28, 2024 • 2 new comments -
[2/N] Non-Tensor: Scalar Support: Add scalar to the cache for eager-through-torch.compile
#124070 commented on
Apr 28, 2024 • 2 new comments -
input.is_sparse() INTERNAL ASSERT FAILED
#120989 commented on
Apr 24, 2024 • 2 new comments -
No factory functions for strided quantized tensors
#74540 commented on
Apr 25, 2024 • 2 new comments -
scatter_reduce method do not support complex number multiplication on CUDA
#121965 commented on
Apr 24, 2024 • 2 new comments -
Placing LSTM model on bfloat16 on GPU causes error
#88136 commented on
Apr 25, 2024 • 2 new comments -
Add 2nd shard to ROCm trunk workflow for core distributed UTs
#121716 commented on
Apr 25, 2024 • 2 new comments -
`torch.distributed` hangs when using `torch.distributed.barrier` before any other communication primitives.
#124714 commented on
Apr 25, 2024 • 2 new comments -
[Inductor][Quant] Change the QConv output scale name
#124246 commented on
Apr 30, 2024 • 2 new comments -
upstream `apex.normalization.FusedRMSNorm`
#72643 commented on
Apr 25, 2024 • 2 new comments -
Grad strides do not match bucket view strides.
#47163 commented on
Apr 25, 2024 • 2 new comments -
Improving format of communication metadata in PyTorch Execution Trace.
#124674 commented on
Apr 25, 2024 • 2 new comments -
Grad strides do not match bucket view strides
#83909 commented on
Apr 25, 2024 • 2 new comments -
Batching rule for `aten::_scaled_dot_product_efficient_attention`
#102457 commented on
Apr 25, 2024 • 2 new comments -
[Performance] Potential Performance optimization for SDPA
#100270 commented on
Apr 26, 2024 • 2 new comments -
Fixed an undefined combination of inputs for torch.fmod.
#120624 commented on
Apr 27, 2024 • 2 new comments -
make tensor data const correct
#97856 commented on
Apr 26, 2024 • 2 new comments -
profiler.export_stacks doesn't return stack trace unless experimental_config is provided
#100253 commented on
Apr 30, 2024 • 1 new comment -
Clang tidy torch csrc16
#120573 commented on
Apr 25, 2024 • 1 new comment -
Skip fx passes for split-cat with Node dims
#124629 commented on
Apr 30, 2024 • 1 new comment -
Tensor.nonzero fails on GPU for tensors containing more than INT_MAX elements
#51871 commented on
Apr 30, 2024 • 1 new comment -
Massive initial memory overhead GPU
#12873 commented on
Apr 30, 2024 • 1 new comment -
Request for deterministic support for reflection_pad2d_backward_cuda
#98925 commented on
Apr 30, 2024 • 1 new comment -
Prefer construction via DLPack to costly element-by-element copy
#120615 commented on
Apr 28, 2024 • 1 new comment -
Upgrade submodule oneDNN to v3.4
#122472 commented on
Apr 30, 2024 • 1 new comment -
RuntimeError: reflection_pad2d_backward_cuda does not have a deterministic implementation
#123843 commented on
Apr 30, 2024 • 1 new comment -
Fix a check message in pickler
#120701 commented on
Apr 27, 2024 • 1 new comment -
While loop autograd
#124573 commented on
Apr 26, 2024 • 1 new comment -
functionalize storage resizing, minimal ppFSDP traceable forward
#122434 commented on
Apr 25, 2024 • 1 new comment -
[dtensor] from_local broadcast use functional collective
#120457 commented on
Apr 24, 2024 • 1 new comment -
Remove dtype check on meta device
#120634 commented on
Apr 26, 2024 • 1 new comment -
Connection closed by peer when using dist.isend in gloo backend
#75512 commented on
Apr 30, 2024 • 1 new comment -
[feature request] Rank-Revealing QR - Adding dgeqp3 support to torch.qr
#10454 commented on
Apr 30, 2024 • 1 new comment -
Some fixups
#124658 commented on
Apr 24, 2024 • 1 new comment -
torch native functions cannot be used with inspect.signature
#28233 commented on
Apr 30, 2024 • 1 new comment -
Make CI less noisy
#124664 commented on
Apr 29, 2024 • 1 new comment -
Cannot re-initialize CUDA in forked subprocess
#40403 commented on
Apr 30, 2024 • 1 new comment -
[Quant][Inductor] Enable lowering of qlinear-binary(-unary) fusion for X86Inductor
#122593 commented on
Apr 29, 2024 • 1 new comment -
Fix DDP no_sync when find_unused_parameters is True
#124193 commented on
Apr 26, 2024 • 1 new comment -
DISABLED test_index (__main__.TestPythonBuiltinOP)
#119160 commented on
Apr 30, 2024 • 1 new comment -
torch.utils.cpp_extension.load recompiling every time
#124454 commented on
Apr 30, 2024 • 1 new comment -
Fix absolute links in pytorch repository and allow it to be proxied
#101798 commented on
Apr 30, 2024 • 1 new comment -
[dynamo] Function => FunctionCtx for placeholder obj
#120577 commented on
Apr 29, 2024 • 1 new comment -
[1/N] Non-Tensor: Scalar Support: Enable aot compile to support aten operations with scalar input like alpha
#124177 commented on
Apr 28, 2024 • 1 new comment -
[tensor] Replace raw loops with std::reduce for size calc.
#120580 commented on
Apr 25, 2024 • 1 new comment -
Understand the oneDNN graph fusion with torch script
#124458 commented on
Apr 30, 2024 • 1 new comment -
[inductor] add cpp builder code.
#124045 commented on
Apr 30, 2024 • 1 new comment -
Prevent cuda:0 context initialization when working on another cuda device
#124722 commented on
Apr 24, 2024 • 1 new comment -
Set simdlen based on ATEN_CPU_CAPABILITY
#123514 commented on
Apr 30, 2024 • 1 new comment -
[WIP] support map impl in pre-dispatch IR
#120159 commented on
Apr 30, 2024 • 1 new comment -
[executorch] Add support for method variant functions in ExecuTorch codegen
#120840 commented on
Apr 30, 2024 • 1 new comment -
Fix dynamo issue "Failed running call_function <built-in method sparse_coo_tensor of type object at 0xDEADBEEF"
#118192 commented on
Apr 28, 2024 • 1 new comment -
ProcessGroupWrapper support custom backend
#124447 commented on
Apr 28, 2024 • 1 new comment -
[dynamo] fix compiling Dataclass construction with default_factory
#120827 commented on
Apr 29, 2024 • 1 new comment -
[MPS] Add support for max_unpool2d
#118665 commented on
Apr 26, 2024 • 1 new comment -
[ROCm] hipSPARSELt Integration
#124320 commented on
Apr 30, 2024 • 1 new comment -
[ROCm] amdsmi library integration
#119182 commented on
Apr 24, 2024 • 1 new comment -
[typing] Rename argument of `nn.Sequential.forward` from `input` to `__input`
#119209 commented on
Apr 28, 2024 • 1 new comment -
Fix missing parameter check in at::batch_norm
#119361 commented on
Apr 29, 2024 • 1 new comment -
Fix stream type to generic in comms default hooks
#120069 commented on
Apr 25, 2024 • 1 new comment -
[Distributed] Add P2P versions of *object_list operations
#124379 commented on
Apr 26, 2024 • 1 new comment -
[Inductor][AMD] Enable pipeliner for Gemm
#120637 commented on
Apr 26, 2024 • 1 new comment -
[Don't merge] Refactor device bound check for xpu code
#120768 commented on
Apr 29, 2024 • 1 new comment -
[fbcode] Upstream parallel fast cat on cpu in OSS cat op
#120753 commented on
Apr 28, 2024 • 1 new comment -
Update expecttest in conda env
#120711 commented on
Apr 27, 2024 • 1 new comment -
DISABLED test_mm_batching (__main__.TestScript)
#119747 commented on
Apr 30, 2024 • 1 new comment -
Add default values to PyTorchMemEffAttention::AttentionKernel::Params members
#112215 commented on
Apr 30, 2024 • 1 new comment -
[sym_shapes][perf] Optimize bound_sympy avoiding sympy equals
#124211 commented on
Apr 23, 2024 • 1 new comment -
Fix `as_strided` functionalization for lazy backend.
#120435 commented on
Apr 28, 2024 • 1 new comment -
[dynamo] Handle np.iinfo/finfo/dtype as input
#124482 commented on
Apr 24, 2024 • 1 new comment -
Add dist hooks support for custom device
#114730 commented on
Apr 27, 2024 • 1 new comment -
Add back non standard shapes test samples for SDPA in common_methods_…
#115464 commented on
Apr 26, 2024 • 1 new comment -
[FAILURE] quantized test
#120941 commented on
Apr 30, 2024 • 1 new comment -
Enable test_embedding_bag_device_* with PYTORCH_TEST_WITH_DYNAMO
#120884 commented on
Apr 29, 2024 • 1 new comment -
Add _to_copy op for jagged NT
#115749 commented on
Apr 29, 2024 • 1 new comment -
[dynamo] fix silent incorrectness caused by variable tracker caching
#120861 commented on
Apr 29, 2024 • 1 new comment -
Some update
#124450 commented on
Apr 29, 2024 • 1 new comment -
[not4land] Batch norm consolidation disable xfails/skips
#120844 commented on
Apr 29, 2024 • 1 new comment -
Add `torch._dynamo.is_fullgraph_compiling` to allow different codepath depending on fullgraph tracing
#120400 commented on
Apr 26, 2024 • 1 new comment -
[WIP] inductor use rand4x
#117125 commented on
Apr 30, 2024 • 1 new comment -
Hacks to work around the fact that ScriptMethod does not have code/signature
#124449 commented on
Apr 29, 2024 • 1 new comment -
[inductor][cpu]GPT2ForSequenceClassification AMP static/dynamic shape default/cpp wrapper single thread accuracy crash
#123503 commented on
Apr 24, 2024 • 1 new comment -
Package manager install on Nvidia Grace Hopper does not make cuda available
#123835 commented on
Apr 24, 2024 • 1 new comment -
HOP dispatch isn't faithful
#124484 commented on
Apr 24, 2024 • 1 new comment -
Placeholder tensor is empty!
#123171 commented on
Apr 24, 2024 • 1 new comment -
Dynamo Export: Support for PixelShuffle
#124338 commented on
Apr 24, 2024 • 1 new comment -
arm64-v8a not compiling due to libpytorch_jni.so
#51020 commented on
Apr 24, 2024 • 1 new comment -
torch.compiler.disable doesn't disable nested functions (also doesn't work as a context manager)
#123771 commented on
Apr 24, 2024 • 1 new comment -
c10::CUDAError
#67978 commented on
Apr 25, 2024 • 1 new comment -
Batch size is hardcoded using torch.jit.trace with LSTMCell
#59530 commented on
Apr 25, 2024 • 1 new comment -
NCCL error of PyTorch 2.1.0 when using multiple gpus
#113245 commented on
Apr 25, 2024 • 1 new comment -
Doesn't work when register hook to torch.nn.MultiheadAttention.out_proj
#78109 commented on
Apr 25, 2024 • 1 new comment -
jit.freeze throws RuntimeError: stack_out && stack_out->size() == 1 INTERNAL ASSERT FAILED at "../torch/csrc/jit/passes/frozen_conv_folding.cpp":281
#80861 commented on
Apr 25, 2024 • 1 new comment -
calling nn.utils.parametrize inside torch.compile leads to error
#115744 commented on
Apr 25, 2024 • 1 new comment -
torch._dynamo.exc.Unsupported: call_function args: UserDefinedObjectVariable(EasyDict)
#120219 commented on
Apr 25, 2024 • 1 new comment -
Update test_cuda.py and test_torch.py optim tests to use OptimizerInfo and optim_db
#123451 commented on
Apr 25, 2024 • 1 new comment -
PyTorch 2.0.0 encountered CUDA error: an illegal memory access was encountered
#99372 commented on
Apr 25, 2024 • 1 new comment -
Transformer Engine Checkpointing Broken on Torch 2.3
#122946 commented on
Apr 25, 2024 • 1 new comment -
[inductor][cpu]pyhpc_turbulent_kinetic_energy AMP multithread static/dynamic shape default/cpp wrapper performance regression
#123801 commented on
Apr 26, 2024 • 1 new comment -
torch.onnx.errors.UnsupportedOperatorError: Exporting the operator 'aten::unflatten' to ONNX opset version 12 is not supported.
#124716 commented on
Apr 26, 2024 • 1 new comment -
masked_fill supports PrivateUse1, when value.device.type is cpu
#124693 commented on
Apr 26, 2024 • 1 new comment -
Discrepancy between CPU->GPU and GPU->CPU data transfer speeds
#52718 commented on
Apr 26, 2024 • 1 new comment -
Advanced indexing with uint8 tensor versus int64 tensor is inconsistent
#20149 commented on
Apr 26, 2024 • 1 new comment -
Fused Linear and Cross-Entropy Loss `torch.nn.functional.linear_cross_entropy`
#124480 commented on
Apr 26, 2024 • 1 new comment -
DISABLED test_inplace_on_view_weak_grad_fn (__main__.TestAutogradWithCompiledAutograd)
#124453 commented on
Apr 26, 2024 • 1 new comment -
Unexpected modification to CPU affinity of Dataloader workers
#101850 commented on
Apr 26, 2024 • 1 new comment -
fusion in fx graph mode did not take care of direct attribute access
#68892 commented on
Apr 26, 2024 • 1 new comment -
DTensor + compile errors during backward when output is non-contiguous
#118219 commented on
Apr 23, 2024 • 1 new comment -
Significant performance degradation with multiprocessing in PyTorch 2.x compared to 1.13.1
#122626 commented on
Apr 23, 2024 • 1 new comment -
DISABLED test_aot_export_module_joint (__main__.TestAOTExport)
#124166 commented on
Apr 23, 2024 • 1 new comment -
DISABLED test_source_multithreaded_complex_work_in_main_thread_True (__main__.TestProfiler)
#119536 commented on
Apr 23, 2024 • 1 new comment -
torch_dispatch has unfaithful behavior w.r.t. wrapped numbers
#124731 commented on
Apr 23, 2024 • 1 new comment -
Importing polars before torch causes a segfault
#124656 commented on
Apr 23, 2024 • 1 new comment -
[torch.compile] torch._dynamo.exc.TorchRuntimeError: Failed running call_function <method 'numpy' of 'torch._C.TensorBase' objects>(*(FakeTensor(..., size=(32, 3, 64, 64)),), **{})
#124247 commented on
Apr 23, 2024 • 1 new comment -
`test_scatter_bf16_cuda` fails on V100
#118581 commented on
Apr 23, 2024 • 1 new comment -
DISABLED test_sparse_tensors (__main__.TestTorchTidyProfiler)
#124253 commented on
Apr 23, 2024 • 1 new comment -
Find a common home for decompositions, perhaps outside of the obliquely named _refs directory
#124427 commented on
Apr 23, 2024 • 1 new comment -
DISABLED test_save_on_cpu_and_checkpoint (__main__.TestAutogradWithCompiledAutograd)
#124706 commented on
Apr 23, 2024 • 1 new comment -
DISABLED test_forward_mode_AD_linalg_lu_cuda_float64 (__main__.TestFwdGradientsCUDA)
#86774 commented on
Apr 23, 2024 • 1 new comment -
DISABLED test_saved_variable_packing_unpacking_did_not_save_original_with_default_hooks (__main__.TestAutogradWithCompiledAutograd)
#124733 commented on
Apr 23, 2024 • 1 new comment -
DISABLED test_saved_tensor_hooks_custom_function_intermediates (__main__.TestAutogradWithCompiledAutograd)
#124723 commented on
Apr 23, 2024 • 1 new comment -
DISABLED test_inplace (__main__.TestAutogradWithCompiledAutograd)
#124446 commented on
Apr 23, 2024 • 1 new comment -
Add BufferDict container
#37386 commented on
Apr 23, 2024 • 1 new comment -
DISABLED test_source_multithreaded_multiple_preexisting_work_in_main_thread_True (__main__.TestProfiler)
#119576 commented on
Apr 23, 2024 • 1 new comment -
DISABLED test_source_multithreaded_open_in_scope_work_in_main_thread_True (__main__.TestProfiler)
#119668 commented on
Apr 23, 2024 • 1 new comment -
aot_export_joint_simple on plain callable (not graph module) doesn't attach stack traces
#102205 commented on
Apr 23, 2024 • 1 new comment -
MPS memory leak in training
#121113 commented on
Apr 24, 2024 • 1 new comment -
[nightly][jit] bad constant exponent (e+38.f) in default_program fused_mul_div_add
#107503 commented on
Apr 24, 2024 • 1 new comment -
Change `GradScaler` to respect an existing `grad_scale` value.
#123428 commented on
Apr 24, 2024 • 1 new comment -
CudaHostAlloc takes a lot of time during training
#124456 commented on
Apr 24, 2024 • 1 new comment -
RuntimeError: derivative for aten::_scaled_dot_product_efficient_attention_backward is not implemented
#117974 commented on
Apr 24, 2024 • 1 new comment -
ImportError `undefined symbol: iJIT_NotifyEvent` encountered when MKL 2024.1 is installed.
#123097 commented on
Apr 24, 2024 • 1 new comment -
[functorch] transforms like jacrev, jacfwd, grad, etc don't work with BatchNorm
#85533 commented on
Apr 24, 2024 • 1 new comment -
Export swallows exception
#111075 commented on
Apr 27, 2024 • 1 new comment -
Multi Scale Deformable Attention Support
#112827 commented on
Apr 27, 2024 • 1 new comment -
FSDP crashes when submodule calls method that isn't `forward()`
#109385 commented on
Apr 28, 2024 • 1 new comment -
[inductor][cpu] FP32/AMP models multiple/single thread static/dynamic shape default/CPP wrapper accuracy crash in 2024-04-14 nightly release
#124286 commented on
Apr 28, 2024 • 1 new comment -
Custom ROCm hip and C++ extensions (replicated from pytorch/tutorials)
#119429 commented on
Apr 29, 2024 • 1 new comment -
Label tracking meta-issue (edit me to get automatically CC'ed on issues! cc bot)
#24422 commented on
Apr 29, 2024 • 1 new comment -
ONNX export is unnecessarily slow (O(N^2))
#121422 commented on
Apr 28, 2024 • 1 new comment -
[RFC] PyTorch next wheel build platform: manylinux-2.28
#123649 commented on
Apr 28, 2024 • 1 new comment -
vec_test_all_types_xxx with dtype c10::complex<float> and c10::complex<double> has failures on division
#104516 commented on
Apr 29, 2024 • 1 new comment -
ImportError: libcudnn.so.8: cannot open shared object file: No such file or directory
#104259 commented on
Apr 29, 2024 • 1 new comment -
RecursionError when running torch.jit.script inside JitTestCase
#76881 commented on
Apr 29, 2024 • 1 new comment -
Add a requirements.txt for windows pip packages
#103354 commented on
Apr 29, 2024 • 1 new comment -
PyTorch Memory Management in GPU-to-CPU Transfers issue
#124487 commented on
Apr 29, 2024 • 1 new comment -
Compile doesn't guard on user NN module attribute
#124717 commented on
Apr 29, 2024 • 1 new comment -
Error: IndexError: map::at When using torch.distributed.all_reduce(tensor)
#116393 commented on
Apr 29, 2024 • 1 new comment -
IndexError: map::at with MPI CUDA collectives
#114040 commented on
Apr 29, 2024 • 1 new comment -
[feature request] np.packbits / np.unpackbits, general BitTensors (maybe can be just tensors with dtype torch.bits8 or have a new dtype torch.bits introduced) and bit packed tensors utilities for saving memory / accesses, support for BitTensors wherever BoolTensors are used
#32867 commented on
Apr 29, 2024 • 1 new comment -
DISABLED test_leaf_assignment (__main__.TestAutogradWithCompiledAutograd)
#124405 commented on
Apr 29, 2024 • 1 new comment -
[FX] Ability to wrap functions in other modules for symbolic tracing
#53534 commented on
Apr 29, 2024 • 1 new comment -
switch more test cases to use MultithreadTestCase
#108744 commented on
Apr 27, 2024 • 1 new comment -
Support using SymBool in arithmetics
#110738 commented on
Apr 27, 2024 • 1 new comment -
Loading traced pytorch model to C++
#124009 commented on
Apr 27, 2024 • 1 new comment -
[RFC] Dynamo Single Step Graph
#117394 commented on
Apr 29, 2024 • 1 new comment -
large model, low memory: need `torch.load` that loads one submodule at a time
#75242 commented on
Apr 29, 2024 • 1 new comment -
torch.onnx: operator 'aten::unflatten' to ONNX is not supported.
#121301 commented on
Apr 26, 2024 • 1 new comment -
Improve behaviour of `torch.linalg.lstsq` on CUDA GPU for rank-deficient matrices
#117122 commented on
Apr 27, 2024 • 1 new comment -
Libtorch crashes docker when included in header file
#124197 commented on
Apr 27, 2024 • 1 new comment -
Segfault in TCPStore and FileStore compare_set()
#123983 commented on
Apr 29, 2024 • 1 new comment -
[Meta Tensor] Inplace set storage of meta tensor will alter the storage's nbytes if meta tensor's nbytes is smaller
#123879 commented on
Apr 29, 2024 • 1 new comment -
DISABLED test_mark_non_differentiable_none (__main__.TestAutogradWithCompiledAutograd)
#124475 commented on
Apr 30, 2024 • 1 new comment -
Registering function that takes `const SymInt&` to op that accepts `SymInt` leads to cryptic error
#124645 commented on
Apr 23, 2024 • 0 new comments -
DISABLED test_default_partitioner_output_tensor_shape_tensor (__main__.TestPartitioning)
#124355 commented on
Apr 29, 2024 • 0 new comments -
DISABLED test_contiguous (__main__.TestPartitioning)
#124323 commented on
Apr 29, 2024 • 0 new comments -
DISABLED test_default_partitioner_getitem (__main__.TestPartitioning)
#124278 commented on
Apr 29, 2024 • 0 new comments -
DISABLED test_aot_export_simplified_basic (__main__.TestAOTExport)
#124254 commented on
Apr 29, 2024 • 0 new comments -
DISABLED test_aot_export_multiple_outputs_require_grad_banned (__main__.TestAOTExport)
#124221 commented on
Apr 29, 2024 • 0 new comments -
Conv with permutation on MPS will lead to negative MSE loss
#124621 commented on
Apr 24, 2024 • 0 new comments -
mps bug: failed assertion `[MPSNDArrayDescriptor sliceDimension:withSubrange:] error: subRange.start (6) is not less than length of dimension[0] (6)'
#96153 commented on
Apr 24, 2024 • 0 new comments -
[inductor][cpu]DebertaV2ForQuestionAnswering AMP static/dynamic shape multiple thread default wrapper regression
#122390 commented on
Apr 24, 2024 • 0 new comments -
torch.quantile on MPS doesn't sort values when dim is not None
#101878 commented on
Apr 24, 2024 • 0 new comments -
Foreach tests should xfail on all dtypes that are not supported
#124726 commented on
Apr 23, 2024 • 0 new comments -
[DONOTREVIEW][DTensor][Test] DTensor 2D sharding
#124339 commented on
Apr 23, 2024 • 0 new comments -
[inductor][cpu]adv_inception_v3, gluon_inception_v3 and inception_v3 AMP performance regression
#122393 commented on
Apr 24, 2024 • 0 new comments -
squash of flight_5 vs flightbase
#124229 commented on
Apr 25, 2024 • 0 new comments -
torch._export can't export resnet50 model
#124595 commented on
Apr 23, 2024 • 0 new comments -
flight51 squashed vs flightbase
#124236 commented on
Apr 25, 2024 • 0 new comments -
Automated submodule update: kineto
#106149 commented on
Apr 30, 2024 • 0 new comments -
fix a typo in the householder_product docs
#124279 commented on
Apr 30, 2024 • 0 new comments -
[Tracker] torch.sparse semi-structured 2.3 beta release
#115662 commented on
Apr 24, 2024 • 0 new comments -
[feature request] Caching allocator diagnostics and memory allocation tracing/visualization
#1529 commented on
Apr 24, 2024 • 0 new comments -
Dynamo-based ONNX Export: Failed to produce a graph during tracing as no tensor operations were found.
#123973 commented on
Apr 23, 2024 • 0 new comments -
DISABLED test_refcounts (__main__.TestTorchTidyProfiler)
#124220 commented on
Apr 29, 2024 • 0 new comments -
`torch.func.functional_call` doesn't work with compiled models
#97909 commented on
Apr 23, 2024 • 0 new comments -
[TESTING] Don't clamp upper to 2
#124631 commented on
Apr 23, 2024 • 0 new comments -
DISABLED test_binary_op_list_error_cases__foreach_add_cuda_int16 (__main__.TestForeachCUDA)
#124636 commented on
Apr 23, 2024 • 0 new comments -
squash of flight_5.3 vs flightbase
#124672 commented on
Apr 25, 2024 • 0 new comments -
DISABLED test_tensor_subclasses (__main__.TestScript)
#119949 commented on
Apr 23, 2024 • 0 new comments -
Output Discrepancy between PyTorch Model and Converted ONNX Model
#124711 commented on
Apr 23, 2024 • 0 new comments -
torch._dynamo.allow_in_graph seems to silently no-op on staticmethods
#124735 commented on
Apr 23, 2024 • 0 new comments -
DISABLED test_resnet18_backward_trace_cpu (__main__.TestPythonKeyCPU)
#124641 commented on
Apr 29, 2024 • 0 new comments -
Updated test_cuda.py optim tests to use OptimizerInfo
#124563 commented on
Apr 28, 2024 • 0 new comments -
[WIP][Inductor Intel GPU backend Upstream] Reuse inductor test for Intel GPU (PART 3)
#124702 commented on
Apr 27, 2024 • 0 new comments -
inductor creates unnecessary buffers
#124653 commented on
Apr 23, 2024 • 0 new comments -
RFC: Turn on no-undefined
#124545 commented on
Apr 30, 2024 • 0 new comments -
DISABLED test_aot_module_simplified_preserves_stack_trace (__main__.TestAOTModuleSimplified)
#124609 commented on
Apr 29, 2024 • 0 new comments -
Stable Diffusion Model Error: torch._dynamo.exc.InternalTorchDynamoError: raw
#124477 commented on
Apr 23, 2024 • 0 new comments -
[supermodules] Remove all supermodule labels
#124521 commented on
Apr 25, 2024 • 0 new comments -
DISABLED test_aot_module_simplified_fake_tensor_gm_raises (__main__.TestAOTModuleSimplified)
#124590 commented on
Apr 29, 2024 • 0 new comments -
[minimizer] Add exclusion function to minimizer base
#124504 commented on
Apr 30, 2024 • 0 new comments -
[dynamo] Allow inlining of hooks for the top module
#124501 commented on
Apr 30, 2024 • 0 new comments -
[dynamo] Support ndarray.dtype attribute access
#124490 commented on
Apr 24, 2024 • 0 new comments -
fix torch.compile with triton kernels under inference_mode
#124489 commented on
Apr 26, 2024 • 0 new comments -
DISABLED test_aot_module_simplified_dynamic (__main__.TestAOTModuleSimplified)
#124510 commented on
Apr 29, 2024 • 0 new comments -
DISABLED test_aot_module_simplified (__main__.TestAOTModuleSimplified)
#124476 commented on
Apr 29, 2024 • 0 new comments -
DISABLED test_aot_dispatch_incorrect_backward (__main__.TestAOTDispatch)
#124459 commented on
Apr 29, 2024 • 0 new comments -
[WIP] [Inductor Intel GPU backend Upstream] Reuse inductor test for Intel GPU (PART 2)
#124147 commented on
Apr 27, 2024 • 0 new comments -
[codemod][lowrisk] Remove extra semi colon from caffe2/c10/core/SymNodeImpl.h
#123055 commented on
Apr 27, 2024 • 0 new comments -
[FSDP2] Eager-Mode Execution Tracker
#120003 commented on
Apr 29, 2024 • 0 new comments -
[onnx.export] Avoid linear loop over symbol_dim_map
#123029 commented on
Apr 25, 2024 • 0 new comments -
FlexAttention isn't using decompositions
#124643 commented on
Apr 24, 2024 • 0 new comments -
Add Gaudi support to benchmarks/dynamo/* benchmark.
#122960 commented on
Apr 24, 2024 • 0 new comments -
[WIP][Inductor Intel GPU backend Upstream] Reuse inductor test for Intel GPU (PART 1)
#122866 commented on
Apr 27, 2024 • 0 new comments -
Make c10::Error empty backtrace as an optional argument
#122611 commented on
Apr 26, 2024 • 0 new comments -
[typing] Rename argument of `nn.Sequential.forward` from `input` to `__input`
#119208 commented on
Apr 28, 2024 • 0 new comments -
Avoid always building stack trace strings in c10::Error
#122086 commented on
Apr 26, 2024 • 0 new comments -
[WIP] Arm64 Enablement
#117274 commented on
Apr 27, 2024 • 0 new comments -
[FSDP] Use generic device handle instead of cuda
#121620 commented on
Apr 24, 2024 • 0 new comments -
[Inductor Cutlass backend] DO NOT REVIEW - to be split up
#121492 commented on
Apr 25, 2024 • 0 new comments -
custom ops should have needs_fixed_stride_order by default
#124647 commented on
Apr 25, 2024 • 0 new comments -
Conflict between ``torch.func`` transformations and ``torch.jit.trace``
#98724 commented on
Apr 25, 2024 • 0 new comments -
Add `ciflow/inductor` for test only changes
#118206 commented on
Apr 23, 2024 • 0 new comments -
lintrunner should fail on badly formatted docstrings
#102227 commented on
Apr 25, 2024 • 0 new comments -
Bfloat16 tensor .numpy() support
#90574 commented on
Apr 25, 2024 • 0 new comments -
[ONNX] stft export fails with dynamo_export
#113067 commented on
Apr 25, 2024 • 0 new comments -
No module named 'caffe2' when using `add_scalar` with string
#119195 commented on
Apr 25, 2024 • 0 new comments -
[dynamo] Validate check_fn
#118448 commented on
Apr 25, 2024 • 0 new comments -
CUDAGraph Tree TORCH_CHECK failed when NCCL operator exists.
#124391 commented on
Apr 26, 2024 • 0 new comments -
[PT2] Return int32 indices in max_pool2d_with_indices
#103785 commented on
Apr 26, 2024 • 0 new comments -
torch.normal ignores default_device
#122886 commented on
Apr 26, 2024 • 0 new comments -
DISABLED test_non_contiguous_tensors_nn_ConvTranspose1d_cuda_complex32 (__main__.TestModuleCUDA)
#81732 commented on
Apr 26, 2024 • 0 new comments -
Switch batch norm stack to consolidated ops
#119496 commented on
Apr 30, 2024 • 0 new comments -
S390x binaries
#120398 commented on
Apr 27, 2024 • 0 new comments -
[FSDP] Removed clamp to `NO_SHARD` for world size 1
#120334 commented on
Apr 24, 2024 • 0 new comments -
Fix implicit fallthroughs where it is simple to do so in caffe2/
#119700 commented on
Apr 27, 2024 • 0 new comments -
[FSDP] Add device in pin_memory argument
#119878 commented on
Apr 24, 2024 • 0 new comments -
[CI] CPU Inductor codepath for AVX2/Default is not tested in CI
#123224 commented on
Apr 24, 2024 • 0 new comments -
[inductor] Enable fx graph caching by default
#124091 commented on
Apr 30, 2024 • 0 new comments -
cpu performance for int4mm kernels
#122813 commented on
Apr 24, 2024 • 0 new comments -
[WIP][inductor] refine loop split logic
#124060 commented on
Apr 30, 2024 • 0 new comments -
Inconsistent results when training a model containing SyncBatchNorm with multiple GPUs
#124680 commented on
Apr 24, 2024 • 0 new comments -
[Inductor] [Quant] Enable lowering of quant per tensor and refactor quant pattern
#124041 commented on
Apr 30, 2024 • 0 new comments -
Add scaled_dot_product_attention "scale" argument to nn.MultiHeadAttention
#124718 commented on
Apr 24, 2024 • 0 new comments -
Migrating from setup.py install/develop to leverage pip standards
#124027 commented on
Apr 28, 2024 • 0 new comments -
[inductor][cpp] GEMM template
#124021 commented on
Apr 30, 2024 • 0 new comments -
Questions about parameter initialization, especially with torch.bfloat16 precision
#124719 commented on
Apr 24, 2024 • 0 new comments -
Fix constant propagation pass
#114471 commented on
Apr 26, 2024 • 0 new comments -
[discussion] Route pointwise Conv1d/Conv2d to matmul? (also in eager)
#116506 commented on
Apr 24, 2024 • 0 new comments -
Initial LR Scheduler composability tests
#123753 commented on
Apr 25, 2024 • 0 new comments -
Fix user warning for tensor LR
#123752 commented on
Apr 25, 2024 • 0 new comments -
Swap warning counter to flag in LRScheduler
#123751 commented on
Apr 25, 2024 • 0 new comments -
Add decomposition for slice_scatter
#123744 commented on
Apr 26, 2024 • 0 new comments -
Dynamo x autograd.Function: graph breaks on all the staticmethods on autograd.Function
#118397 commented on
Apr 24, 2024 • 0 new comments -
[debug] a debug PR to test perf regression due to triton
#123694 commented on
Apr 30, 2024 • 0 new comments -
Dynamo x autograd.Function: graph breaks on freevars in forward
#118394 commented on
Apr 24, 2024 • 0 new comments -
Use _unsafe_masked_index in masked_scatter decomposition
#123667 commented on
Apr 26, 2024 • 0 new comments -
Improve decomposition for constant_pad_nd
#123661 commented on
Apr 26, 2024 • 0 new comments -
Dynamo x autograd.Function: silently ignores all of the ctx.methods
#118396 commented on
Apr 24, 2024 • 0 new comments -
Automated submodule update: FBGEMM
#115316 commented on
Apr 30, 2024 • 0 new comments -
Rename TorchDynamo -> Dynamo in the dynamo tutorial doc
#123431 commented on
Apr 27, 2024 • 0 new comments -
Reenable dim for python 3.12
#123384 commented on
Apr 24, 2024 • 0 new comments -
[Dynamic Shapes] Fix error handling for indirectly fully constrained dynamic dimensions
#123293 commented on
Apr 30, 2024 • 0 new comments -
[CI] Node-20 update
#122115 commented on
Apr 24, 2024 • 0 new comments -
Add mode to MemoryDep to track atomic accumulates
#123223 commented on
Apr 26, 2024 • 0 new comments -
Decompositions for upsample linear backward
#123222 commented on
Apr 26, 2024 • 0 new comments