Note that changes from release candidates (e.g. v8.0.0-rc1, v8.0.0-rc2) are included/repeated in the final release (e.g. v8.0.0) change log.
The following are changes which have been implemented in the VTR master branch but have not yet been included in an official release.
- Support for Advanced Architectures:
  - 3D FPGA and RAD architectures.
  - Architectures with hard Networks-on-Chip (NoCs).
  - Distinct horizontal and vertical channel widths and types.
  - Diagonal routing wires and other complex wire shapes (L-shaped, T-shaped, etc.).
- New Benchmark Suites:
  - Koios: A deep-learning-focused benchmark suite with various design sizes.
  - Hermes: Benchmarks utilizing hard NoCs.
  - TitanNew: Large benchmarks targeting the Stratix 10 architecture.
- Commercial FPGA Architecture Captures:
  - Intel’s Stratix 10 FPGA architecture.
  - AMD’s 7-series FPGA architecture.
- Parmys Logic Synthesis Flow:
  - Better Verilog language coverage
  - More efficient hard block mapping
- VPR Graphics Visualizations:
  - New interface for improved usability, with the underlying graphics rewritten using EZGL/GTK to allow more UI widgets.
  - Algorithm breakpoint visualizations for placement and routing algorithm debugging.
  - User-guided (manual) placement optimization features.
  - Live connection between client graphical applications and VTR engines through sockets (server mode).
  - Interactive timing path analysis (IPA) client using server mode.
- Performance Enhancements:
  - Parallel router for faster inter-cluster routing or flat routing.
- Re-clustering API to modify packing decisions during the flow.
- Support for floorplanning and placement constraints.
- Unified intra- and inter-cluster (flat) routing.
- Comprehensive web-based VTR utilities and API documentation.
- The default values of many command-line options have changed (e.g. inner_num is now 0.5 instead of 1.0)
- Changes to placement engine
  - Smart centroid initial placement algorithm.
  - Multiple smart placement directed moves.
  - Reinforcement learning-based placement algorithm.
- Changes to routing engine
  - Faster lookahead creation.
  - More accurate lookahead for large blocks.
  - More efficient heap and pruning strategies.
  - Maximum `pres_fac` capped to avoid possible numeric issues.
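
The effect of capping `pres_fac` can be illustrated with a small sketch. This is not VPR's implementation: the function name, the geometric growth rule, and all numeric values below are assumptions chosen for illustration. It only shows how a cap stops a multiplicatively growing congestion factor from increasing without bound over many routing iterations.

```python
def pres_fac_schedule(initial, mult, max_pres_fac, iterations):
    """Return the (hypothetical) present-congestion factor per routing iteration."""
    schedule = []
    pres_fac = initial
    for _ in range(iterations):
        schedule.append(pres_fac)
        # Grow geometrically, but clamp to the cap so the factor cannot
        # blow up numerically when routing runs for many iterations.
        pres_fac = min(pres_fac * mult, max_pres_fac)
    return schedule


# Illustrative values only; they are not VPR defaults.
schedule = pres_fac_schedule(initial=0.5, mult=2.0, max_pres_fac=8.0, iterations=7)
print(schedule)
```

Without the `min(...)` clamp, the factor would keep doubling on every iteration, which is the kind of unbounded growth a cap is meant to prevent.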
- Many algorithmic and coding bugs are fixed in this release
- Breadth-first (non-timing-driven) router.
- Non-linear congestion placement cost.
- Support for arbitrary FPGA device grids/floorplans
- Support for clustered blocks with width > 1
- Customizable connection-block and switch-block patterns (controlled from FPGA architecture file)
- Fan-out dependent routing mux delays
- VPR can generate/load a routing architecture (routing resource graph) in XML format
- VPR can load routing from a `.route` file
- VPR can perform analysis (STA/Power/Area) independently from optimization (via `vpr --analysis`)
- VPR supports netlist primitives with multiple clocks
- VPR can perform hold-time (minimum delay) timing analysis
- Minimum delays can be annotated in the FPGA architecture file
- Flow supports formal verification of circuit implementation against input netlist
- Support for generating FASM to drive bitstream generators
- Routing predictor which predicts and aborts impossible routings early (saves significant run-time during minimum channel width search)
- Support for minimum routable channel width 'hints' (reduces minimum channel width search run-time if accurate)
- Improved VPR debugging/verbosity controls
- VPR can perform basic netlist cleaning (e.g. sweeping dangling logic)
- VPR graphics visualizations:
  - Critical path during placement/routing
  - Cluster pin utilization heatmap
  - Routing utilization heatmap
  - Routing resource cost heatmaps
  - Placement macros
- VPR can route constant nets
- VPR can route clock nets
- VPR can load netlists in extended BLIF (eBLIF) format
- Support for generating post-placement timing reports
- Improved router 'map' lookahead which adapts to routing architecture structure
- Script to upgrade legacy architecture files (`vtr_flow/scripts/upgrade_arch.py`)
- Support for Fc overrides which depend on both pin and target wire segment type
- Support for non-configurable switches (shorts, inline-buffers) used to model structures like clock-trees and non-linear wires (e.g. 'L' or 'T' shapes)
- Various other features since VTR 7
- VPR will exit with code 1 on errors (something went wrong), and code 2 when unable to implement a circuit (e.g. unroutable)
- VPR now gives more complete help about command-line options (`vpr -h`)
- Improved a wide variety of error messages
- Improved STA timing reports (more details, clearer format)
- VPR now uses Tatum as its STA engine
- VPR now detects mismatched architecture (.xml) and implementation (.net/.place/.route) files more robustly
- Improved router run-time and quality through incremental re-routing and improved handling of high-fanout nets
- The timing edges within each netlist primitive must now be specified in the section of the architecture file
- All interconnect tags must have unique names in the architecture file
- Connection block input pin switch must now be specified in section of the architecture file
- Renamed switch types buffered/pass_trans to more descriptive tristate/pass_gate in architecture file
- Require longline segment types to have no switchblock/connectionblock specification
- Improved naming (true/false -> none/full/instance) and more control over block pin equivalence specifications
- VPR will produce a .route file even if the routing is illegal (aids debugging); however, analysis results will not be produced unless `vpr --analysis` is specified
- VPR long arguments are now always prefixed by two dashes (e.g. `--route`), while short single-letter arguments are prefixed by a single dash (e.g. `-h`)
- Improved logic optimization by using a recent (2018) version of ABC and a new synthesis script
- Significantly improved implementation quality (~14% smaller minimum routable channel widths, 32-42% reduced wirelength, 7-10% lower critical path delay)
- Significantly reduced run-time (~5.5-6.3x faster) and memory usage (~3.3-5x lower)
- Support for non-contiguous track numbers in externally loaded RR graphs
- Improved placer quality (reduced cost round-off)
- Various other changes since VTR 7
- FPGA architecture file tags can be in arbitrary order
- SDC command arguments can be in arbitrary order
- Numerous other fixes since VTR 7
- Classic VPR timing analyzer
- IO channel distribution section of architecture file
- VPR's breadth-first router (use the timing-driven router, which provides superior QoR and run-time)
- A Docker image for the VTR 8.0 release is available as mohamedelgammal/vtr8:latest. You can run it using the following commands:
$ sudo docker pull mohamedelgammal/vtr8:latest
$ sudo docker run -it mohamedelgammal/vtr8:latest
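
The exit-code convention listed in these notes (code 1 on errors, code 2 when a circuit cannot be implemented) lends itself to scripting. The following is a minimal sketch of a wrapper that tells the two failure modes apart. The helper names are hypothetical, and treating 0 as success is the usual process convention rather than something stated in these notes.

```python
import subprocess

# Meanings of VPR exit codes as described in the release notes.
# Code 0 as "success" is an assumption based on the usual convention.
VPR_EXIT_MEANINGS = {
    0: "success",
    1: "error",            # something went wrong
    2: "unimplementable",  # e.g. the circuit is unroutable
}


def classify_vpr_exit(returncode):
    """Map a VPR process return code to a short status string."""
    return VPR_EXIT_MEANINGS.get(returncode, "unknown")


def run_vpr(args, vpr_binary="vpr"):
    """Run VPR with the given argument list and return its classified status."""
    result = subprocess.run([vpr_binary, *args], check=False)
    return classify_vpr_exit(result.returncode)
```

A script searching over channel widths could, for example, call `run_vpr(["arch.xml", "circuit.blif", "--route_chan_width", "100"])` (file names are placeholders) and treat `"unimplementable"` as a signal to try a larger width, while treating `"error"` as a reason to stop.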
- Support for non-contiguous track numbers in externally loaded RR graphs
- Improved placer quality (reduced cost round-off)
- Support for arbitrary FPGA device grids/floorplans
- Support for clustered blocks with width > 1
- Customizable connection-block and switch-block patterns (controlled from FPGA architecture file)
- Fan-out dependent routing mux delays
- VPR can generate/load a routing architecture (routing resource graph) in XML format
- VPR can load routing from a `.route` file
- VPR can perform analysis (STA/Power/Area) independently from optimization (via `vpr --analysis`)
- VPR supports netlist primitives with multiple clocks
- VPR can perform hold-time (minimum delay) timing analysis
- Minimum delays can be annotated in the FPGA architecture file
- Flow supports formal verification of circuit implementation against input netlist
- Support for generating FASM to drive bitstream generators
- Routing predictor which predicts and aborts impossible routings early (saves significant run-time during minimum channel width search)
- Support for minimum routable channel width 'hints' (reduces minimum channel width search run-time if accurate)
- Improved VPR debugging/verbosity controls
- VPR can perform basic netlist cleaning (e.g. sweeping dangling logic)
- VPR graphics visualizations:
  - Critical path during placement/routing
  - Cluster pin utilization heatmap
  - Routing utilization heatmap
  - Routing resource cost heatmaps
  - Placement macros
- VPR can route constant nets
- VPR can route clock nets
- VPR can load netlists in extended BLIF (eBLIF) format
- Support for generating post-placement timing reports
- Improved router 'map' lookahead which adapts to routing architecture structure
- Script to upgrade legacy architecture files (`vtr_flow/scripts/upgrade_arch.py`)
- Support for Fc overrides which depend on both pin and target wire segment type
- Support for non-configurable switches (shorts, inline-buffers) used to model structures like clock-trees and non-linear wires (e.g. 'L' or 'T' shapes)
- Various other features since VTR 7
- VPR will exit with code 1 on errors (something went wrong), and code 2 when unable to implement a circuit (e.g. unroutable)
- VPR now gives more complete help about command-line options (`vpr -h`)
- Improved a wide variety of error messages
- Improved STA timing reports (more details, clearer format)
- VPR now uses Tatum as its STA engine
- VPR now detects mismatched architecture (.xml) and implementation (.net/.place/.route) files more robustly
- Improved router run-time and quality through incremental re-routing and improved handling of high-fanout nets
- The timing edges within each netlist primitive must now be specified in the section of the architecture file
- All interconnect tags must have unique names in the architecture file
- Connection block input pin switch must now be specified in section of the architecture file
- Renamed switch types buffered/pass_trans to more descriptive tristate/pass_gate in architecture file
- Require longline segment types to have no switchblock/connectionblock specification
- Improved naming (true/false -> none/full/instance) and more control over block pin equivalence specifications
- VPR will produce a .route file even if the routing is illegal (aids debugging); however, analysis results will not be produced unless `vpr --analysis` is specified
- VPR long arguments are now always prefixed by two dashes (e.g. `--route`), while short single-letter arguments are prefixed by a single dash (e.g. `-h`)
- Improved logic optimization by using a recent (2018) version of ABC and a new synthesis script
- Significantly improved implementation quality (~14% smaller minimum routable channel widths, 32-42% reduced wirelength, 7-10% lower critical path delay)
- Significantly reduced run-time (~5.5-6.3x faster) and memory usage (~3.3-5x lower)
- Various other changes since VTR 7
- FPGA architecture file tags can be in arbitrary order
- SDC command arguments can be in arbitrary order
- Numerous other fixes since VTR 7
- Classic VPR timing analyzer
- IO channel distribution section of architecture file
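
The switch type rename listed above (buffered/pass_trans to tristate/pass_gate) can be illustrated with a short sketch. This is not a substitute for `vtr_flow/scripts/upgrade_arch.py`. It assumes that switches appear as `<switch>` elements carrying a `type` attribute, and the function name and the tiny example document are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Old switch type names mapped to their renamed equivalents.
SWITCH_TYPE_RENAMES = {"buffered": "tristate", "pass_trans": "pass_gate"}


def rename_switch_types(arch_xml):
    """Return (updated XML text, number of switches renamed)."""
    root = ET.fromstring(arch_xml)
    renamed = 0
    for switch in root.iter("switch"):
        old_type = switch.get("type")
        if old_type in SWITCH_TYPE_RENAMES:
            switch.set("type", SWITCH_TYPE_RENAMES[old_type])
            renamed += 1
    return ET.tostring(root, encoding="unicode"), renamed


# Hypothetical, heavily simplified architecture fragment.
example = """<architecture>
  <switchlist>
    <switch type="buffered" name="sb_buf"/>
    <switch type="pass_trans" name="sb_pass"/>
    <switch type="mux" name="sb_mux"/>
  </switchlist>
</architecture>"""

updated, count = rename_switch_types(example)
print(count)
```

Switch types that are not part of the rename (such as `mux` in the example) are left untouched.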