Pull requests: pytorch/FBGEMM
Add trace export to mixdim benchmark and fix FP16 benchmark consistency (#5665)
cla signed
fb-exported
meta-exported
#5665
opened Apr 20, 2026 by
q10
Contributor
log query empty count vs total count
cla signed
fb-exported
meta-exported
#5657
opened Apr 17, 2026 by
xywang9334
Fix VBE batch sizes not passed to request builder (#5653)
cla signed
fb-exported
meta-exported
#5653
opened Apr 17, 2026 by
gregmacnamara
Triton/TLX IKBO FA (#5651)
cla signed
fb-exported
meta-exported
#5651
opened Apr 17, 2026 by
liptds
Port merge_embeddings benchmark to tritonbench
cla signed
fb-exported
meta-exported
#5650
opened Apr 16, 2026 by
q10
Contributor
Validate total_num_blocks divisibility by my_size in block_bucketize (#5646)
cla signed
fb-exported
meta-exported
#5649
opened Apr 16, 2026 by
q10
Contributor
Fix bf16 rounding to IEEE 754 ties-to-even
cla signed
#5648
opened Apr 16, 2026 by
cyyever
Contributor
Add CPU support in fbgemm for FloatToFP8RowwiseQuantized and FP8RowwiseQuantizedToFloat (#5644)
cla signed
fb-exported
meta-exported
#5644
opened Apr 15, 2026 by
djjatmeta
Fix TBE v2 forward kernel for embedding dim > 1024 (#5326) (#5569)
cla signed
fb-exported
meta-exported
#5641
opened Apr 15, 2026 by
q10
Contributor
Add UVM host-mapped memory support for dense TBE kernel (#5640)
cla signed
fb-exported
meta-exported
#5640
opened Apr 15, 2026 by
TroyGarden
Contributor
Fix intra-warp and inter-warp race conditions in bounds_check_indices v1 and v2 CUDA kernels
cla signed
fb-exported
meta-exported
#5638
opened Apr 15, 2026 by
gchalump
Contributor
Add missing async proxy fence
cla signed
fb-exported
meta-exported
#5637
opened Apr 15, 2026 by
lw
Contributor
Add aligned_unique_ptr RAII wrapper to avoid leak risks (#5609)
cla signed
fb-exported
meta-exported
#5615
opened Apr 11, 2026 by
q10
Contributor
Port batched_dense_vec_jagged_2d_mul and jagged_1d_to_truncated_values to tritonbench
cla signed
fb-exported
meta-exported
#5603
opened Apr 9, 2026 by
q10
Contributor
Replace rocm-smi with amd-smi across ROCm build, CI, and docs
cla signed
module: rocm
#5597
opened Apr 8, 2026 by
adam360x
3 tasks done
bf16 scale/bias for INT4 (#5595)
cla signed
fb-exported
meta-exported
#5595
opened Apr 8, 2026 by
jeetkanjani7
Enable more clang-tidy checks on C++20 (#5575)
cla signed
fb-exported
meta-exported
module: rocm
#5588
opened Apr 7, 2026 by
q10
Contributor
Add gflag to select feature names for SSD KV embedding table
cla signed
fb-exported
meta-exported
#5585
opened Apr 7, 2026 by
jnwan
Split RowWiseSparseAdagradFused.cc.stripped.o from fbcode//admarket/adfinder:adfinder
cla signed
fb-exported
meta-exported
#5578
opened Apr 6, 2026 by
meta-codesync
bot
Port expand_into_jagged_permute benchmark to tritonbench
cla signed
fb-exported
meta-exported
#5566
opened Apr 1, 2026 by
q10
Contributor
Fix bash scripts to fail correctly for ROCm jobs (#5564)
ciflow/rocm-mi300
cla signed
fb-exported
meta-exported
module: rocm
#5564
opened Mar 31, 2026 by
q10
Contributor
Add AMD/ROCm support for SSD TBE inference
cla signed
fb-exported
meta-exported
module: rocm
#5561
opened Mar 31, 2026 by
goldcoderZ
Contributor
Add TurboSSDInferenceModule for HSTU serving integration
cla signed
fb-exported
meta-exported
#5560
opened Mar 31, 2026 by
goldcoderZ
Contributor
2D weights support for permute_1D_data_kernel_vec
cla signed
fb-exported
meta-exported
#5557
opened Mar 31, 2026 by
kausv
Contributor