WARP V Configuration Front End

Steve Hoover edited this page Feb 13, 2021 · 10 revisions

WARP-V Configurator Project Outline

This page outlines a project to create an online front-end for configuring WARP-V and deploying it using 1st CLaaS. (It is not clear whether this project should be part of this WARP-V repo, the 1st CLaaS repo, or its own repo.)

Background Info

WARP-V is configurable via parameters in the code. These parameters take the form of M4 macro preprocessor defines. For example:

m4_default(['M4_BRANCH_PRED'], ['fallthrough'])

Other configuration options include pipeline staging, support for the RISC-V M, F, and B extensions, and many-core parameters.

Phase 1: Configuration

The WARP-V parameters will evolve over time, so it is important that the configuration front-end be easy to extend for someone who is not proficient with web technologies. There may be an opportunity to leverage an existing open-source settings panel with a well-documented API for defining parameters, though we'll want the GUI for some parameters to be custom. Perhaps a JSON parameter spec should be used as input.

The configuration front-end will produce JSON output characterizing the configuration.
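As a rough illustration of the "JSON parameter spec as input" idea, the sketch below defines a small spec and merges user choices over its defaults to produce the configuration JSON. The spec field names (`type`, `min`, `default`, `label`), the helper `build_config`, and the chosen minimums are assumptions made for this example, not an existing WARP-V format.

```python
import json

# Hypothetical parameter spec. Field names and limits are illustrative assumptions;
# the parameter names are taken from the configuration sketch on this page.
PARAM_SPEC = {
    "cores": {"type": "int", "min": 1, "default": 1, "label": "Number of cores"},
    "impl": {"type": "bool", "default": False, "label": "Implementation (vs. simulation)"},
    "fetch_stage_inc": {"type": "int", "min": 0, "default": 0, "label": "Fetch stage increment"},
    "ld_return_align": {"type": "int", "min": 0, "default": 1, "label": "Load return alignment"},
}


def build_config(spec, user_values):
    """Merge user-supplied values over the spec defaults, validating each one."""
    unknown = set(user_values) - set(spec)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    config = {}
    for name, rules in spec.items():
        value = user_values.get(name, rules["default"])
        if rules["type"] == "bool":
            if not isinstance(value, bool):
                raise ValueError(f"{name} must be true or false")
        elif rules["type"] == "int":
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                raise ValueError(f"{name} must be an integer")
            if "min" in rules and value < rules["min"]:
                raise ValueError(f"{name} must be >= {rules['min']}")
        config[name] = value
    return config


# The front-end would emit this string as its output.
config_json = json.dumps(build_config(PARAM_SPEC, {"cores": 2}))
```

With a spec like this, adding a new WARP-V parameter would mean adding one entry to the spec rather than editing GUI code, which is the extensibility goal described above.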

Configuration JSON

Here is a sketch of some example JSON describing a CPU configuration:

{ "cores": 2,
  "vcs": 2,
  "prios": 2,
  "max_packet_size": 3,
  "impl": false,
  // These are sequential. M4_YYY = M4_XXX + yyy_inc.
  // M4_NEXT_PC stage always 1
  "fetch_stage_inc": 0,  // 0...
  "decode_stage_inc": 0,  // 0...
  "branch_pred_stage_inc": 0, // Checkbox for 0/1, though JSON supports > 1.
  "reg_rd_stage_inc": 0,
  "execute_stage_inc": 0,
  "result_stage": 0,
  "reg_wr_stage": 0,
  "mem_wr_stage": 0,

  "ld_return_align": 1, // 0...
}

This will produce the following M4/TL-Verilog code (where comments don't matter):

   m4_define_hier(['M4_CORE'], 2)  // Number of cores.
      // (if M4_CORE_CNT > 1)
      m4_define_hier(['M4_VC'], 2)    // VCs (meaningful if > 1 core).
      m4_define_hier(['M4_PRIO'], 2)  // Number of priority levels in the NoC.
      m4_define(['M4_MAX_PACKET_SIZE'], 3)   // Max number of payload flits in a packet.
   m4_default(['M4_IMPL'], 0)  // For implementation (vs. simulation).
   // A hook for a software-controlled reset. None by default.
   m4_define(['m4_soft_reset'], 1'b0)
   // A hook for CPU back-pressure in M4_REG_RD_STAGE.
   // Various sources of back-pressure can add to this expression.
   // Currently, this is envisioned for CSR writes that cannot be processed, such as
   // NoC packet writes.
   m4_define(['m4_cpu_blocked'], 1'b0)


   // Define the implementation configuration, including pipeline depth and staging.
   // Define the following:
   //   Stages:
   //     M4_NEXT_PC_STAGE: Determining fetch PC for the NEXT instruction (not this one).
   //     M4_FETCH_STAGE: Instruction fetch.
   //     M4_DECODE_STAGE: Instruction decode.
   //     M4_BRANCH_PRED_STAGE: Branch predict (taken/not-taken). Currently, we mispredict to a known branch target,
   //                           so branch prediction is only relevant if target is computed before taken/not-taken is known.
   //                           For other ISAs prediction is forced to fallthrough, and there is no pred-taken redirect.
   //     M4_REG_RD_STAGE: Register file read.
   //     M4_EXECUTE_STAGE: Operation execution.
   //     M4_RESULT_STAGE: Select execution result.
   //     M4_BRANCH_TARGET_CALC_STAGE: Calculate branch target (generally EXECUTE, but some designs
   //                                  might produce offset from EXECUTE, then compute target).
   //     M4_MEM_WR_STAGE: Memory write.
   //     M4_REG_WR_STAGE: Register file write.
   //     Deltas (default to 0):
   //       M4_DELAY_BRANCH_TARGET_CALC: 1 to delay branch target calculation 1 stage from its nominal (ISA-specific) stage.
   //   Latencies (default to 0):
   //     M4_LD_RETURN_ALIGN: Alignment of load return pseudo-instruction into |mem pipeline.
   //                         If |mem stages reflect nominal alignment w/ load instruction, this is the
   //                         nominal load latency.
   //     Deltas (default to 0):
   //       M4_EXTRA_PRED_TAKEN_BUBBLE: 0 or 1. 0 aligns PC_MUX with BRANCH_TARGET_CALC.
   //       M4_EXTRA_REPLAY_BUBBLE:     0 or 1. 0 aligns PC_MUX with RD_REG for replays.
   //       M4_EXTRA_JUMP_BUBBLE:       0 or 1. 0 aligns PC_MUX with EXECUTE for jumps.
   //       M4_EXTRA_PRED_TAKEN_BUBBLE: 0 or 1. 0 aligns PC_MUX with EXECUTE for pred_taken.
   //       M4_EXTRA_INDIRECT_JUMP_BUBBLE: 0 or 1. 0 aligns PC_MUX with EXECUTE for indirect_jump.
   //       M4_EXTRA_BRANCH_BUBBLE:     0 or 1. 0 aligns PC_MUX with EXECUTE for branches.
   //       M4_EXTRA_TRAP_BUBBLE:       0 or 1. 0 aligns PC_MUX with EXECUTE for traps.
   //   M4_BRANCH_PRED: {fallthrough, two_bit, ...}
   //   M4_DATA_MEM_WORDS: Number of data memory locations.
   m4_defines(
      (M4_NEXT_PC_STAGE, 0),
      (M4_FETCH_STAGE, 0),
      (M4_DECODE_STAGE, 0),
      (M4_BRANCH_PRED_STAGE, 0),
      (M4_REG_RD_STAGE, 0),
      (M4_EXECUTE_STAGE, 0),
      (M4_RESULT_STAGE, 0),
      (M4_REG_WR_STAGE, 0),
      (M4_MEM_WR_STAGE, 0),
      (M4_LD_RETURN_ALIGN, 1))
   m4_default(['M4_BRANCH_PRED'], ['fallthrough'])
   m4_define_hier(['M4_DATA_MEM_WORDS'], 32)
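The stage portion of this translation could be sketched as follows. This is a minimal illustration, not the actual generator: it assumes every stage key in the JSON is an increment over the previous stage (including `result_stage`, `reg_wr_stage`, and `mem_wr_stage`, which lack the `_inc` suffix in the sketch above), and it starts `M4_NEXT_PC_STAGE` at 0 to match the generated code shown above. The function names are hypothetical.

```python
# Stage chain in pipeline order: (JSON key, M4 macro name).
# Treating every key as an increment is an assumption of this sketch.
STAGE_CHAIN = [
    ("fetch_stage_inc", "M4_FETCH_STAGE"),
    ("decode_stage_inc", "M4_DECODE_STAGE"),
    ("branch_pred_stage_inc", "M4_BRANCH_PRED_STAGE"),
    ("reg_rd_stage_inc", "M4_REG_RD_STAGE"),
    ("execute_stage_inc", "M4_EXECUTE_STAGE"),
    ("result_stage", "M4_RESULT_STAGE"),
    ("reg_wr_stage", "M4_REG_WR_STAGE"),
    ("mem_wr_stage", "M4_MEM_WR_STAGE"),
]


def stage_defines(config, next_pc_stage=0):
    """Accumulate increments into absolute stages: M4_YYY = M4_XXX + yyy_inc."""
    stage = next_pc_stage  # assumed base value, matching the listing above
    defines = [("M4_NEXT_PC_STAGE", stage)]
    for key, macro in STAGE_CHAIN:
        stage += config.get(key, 0)
        defines.append((macro, stage))
    defines.append(("M4_LD_RETURN_ALIGN", config.get("ld_return_align", 0)))
    return defines


def render_m4_defines(defines, indent="   "):
    """Format (name, value) pairs as a single m4_defines(...) call."""
    body = ",\n".join(f"{indent}   ({name}, {value})" for name, value in defines)
    return f"{indent}m4_defines(\n{body})"


# Stage-related values from the configuration sketch above.
example_config = {
    "fetch_stage_inc": 0, "decode_stage_inc": 0, "branch_pred_stage_inc": 0,
    "reg_rd_stage_inc": 0, "execute_stage_inc": 0, "result_stage": 0,
    "reg_wr_stage": 0, "mem_wr_stage": 0, "ld_return_align": 1,
}
print(render_m4_defines(stage_defines(example_config)))
```

For the example configuration this yields all stages at 0 and `M4_LD_RETURN_ALIGN` at 1, as in the listing above. Setting, say, `"execute_stage_inc": 1` would push `M4_EXECUTE_STAGE` and every later stage to 1 under the same assumptions.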

Phase 2: TL-Verilog

The JSON CPU configuration will be translated to a main TL-Verilog file that produces a WARP-V core with the given configuration. This translation will be performed by a microservice that receives a GET or POST request with a JSON configuration parameter and returns the TL-Verilog code. Implementing this function as a microservice enables WARP-V code generation to be used in various contexts:

  • scripted generation of M4/TL-Verilog source or, in combination with SandPiper SaaS, of non-M4 TL-Verilog, Verilog, or SystemVerilog
  • within a web page
  • within Makerchip
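One possible shape for such a microservice, using only the Python standard library, is sketched below. The query parameter name `json`, the port, and the placeholder `generate_tlv` function are assumptions for illustration; a real generator would emit the complete WARP-V main file rather than a single line.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def generate_tlv(config):
    """Placeholder generator (assumption): emits only the core-count define."""
    return f"m4_define_hier(['M4_CORE'], {int(config.get('cores', 1))})\n"


def config_from_request(method, path, body=b""):
    """Extract the JSON configuration from a GET query string or a POST body."""
    if method == "GET":
        # Hypothetical parameter name "json" carries the configuration.
        raw = parse_qs(urlparse(path).query).get("json", ["{}"])[0]
    else:
        raw = body.decode("utf-8") or "{}"
    config = json.loads(raw)
    if not isinstance(config, dict):
        raise ValueError("configuration must be a JSON object")
    return config


class ConfigHandler(BaseHTTPRequestHandler):
    def _respond(self, body=b""):
        try:
            config = config_from_request(self.command, self.path, body)
            status, text = 200, generate_tlv(config)
        except (ValueError, TypeError) as err:
            status, text = 400, f"bad configuration: {err}\n"
        payload = text.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def do_GET(self):
        self._respond()

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self._respond(self.rfile.read(length))


def serve(port=8080):
    """Run the service until interrupted (port is an arbitrary choice)."""
    HTTPServer(("", port), ConfigHandler).serve_forever()
```

Calling `serve()` starts the service. Keeping `config_from_request` and `generate_tlv` as plain functions means the same logic can be reused from scripts without going through HTTP.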

TL-Verilog code could be opened in Makerchip in two ways:

  • The JSON output of the front-end can be provided as a GET parameter in a URL: http://www.makerchip.com/sandbox?code_url=..., where ... references the new microservice with the JSON provided as a GET parameter.
  • Makerchip supports endpoints to create a project, open the project in Makerchip, and delete a project. (See makerchip-app for example use.)
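For the first option, the microservice URL (with its JSON parameter) has to be URL-encoded before it is nested inside the Makerchip URL. A minimal sketch follows; the service address `https://example.com/warpv` and the parameter name `json` are placeholders, since the real endpoint is not yet defined.

```python
import json
from urllib.parse import parse_qs, urlencode, urlparse

MAKERCHIP_SANDBOX = "http://www.makerchip.com/sandbox"  # from the text above
SERVICE_URL = "https://example.com/warpv"  # hypothetical microservice address


def makerchip_url(config, service_url=SERVICE_URL):
    """Build a Makerchip sandbox URL whose code_url points at the microservice."""
    compact = json.dumps(config, separators=(",", ":"))
    # Inner URL: the microservice with the configuration as a GET parameter.
    code_url = service_url + "?" + urlencode({"json": compact})
    # Outer URL: urlencode escapes the inner URL, including its own "?" and "=".
    return MAKERCHIP_SANDBOX + "?" + urlencode({"code_url": code_url})


url = makerchip_url({"cores": 2, "vcs": 2})

# Round-trip check: the nested configuration survives the double encoding.
outer = parse_qs(urlparse(url).query)
inner = parse_qs(urlparse(outer["code_url"][0]).query)
recovered = json.loads(inner["json"][0])
```

Encoding in two steps matters here: without escaping the inner URL, its `?` and `&` characters would be interpreted as part of the Makerchip query string.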

Phase 3: Assembly Code

The pre-existing WARP-V example app in 1st CLaaS provides a front-end capable of assembling assembly code via the built-in WARP-V assembler by calling SandPiper SaaS with TL-V code that references a WARP-V include file. This binary code can then be delivered to a WARP-V microservice.

This assembler front-end should be incorporated into the configuration front-end, where the assembly code becomes part of the configuration, to generate a CPU with a hard-coded program. (Today, the WARP-V assembler is only available for RISC-V configurations.)

The pre-existing flow should also be supported, where the code is assembled and then loaded into the CPU's instruction memory via the 1st CLaaS WebSocket.

Phase 4: Versioning WARP-V

The WARP-V codebase on which the 1st CLaaS build operates and the WARP-V codebase used by the front-end configuration and assembler should be the same version.

Phase 5: Deployment via Front-End

Let's hold off on this. If we can load in Makerchip, we should focus on general deployment via Makerchip. We could, however, provide clear instructions and automation for bundling the CPU as a 1st CLaaS app.

Phase 6: Marketing

Provide a great landing page and make noise on social media.

Phase 7: Better GUI

For pipeline staging we could provide a GUI like:

[Figure: Pipeline Diagram]

We could also consider using these parameters in a simplified mock-up CPU, or using SandPiper SaaS to generate logic diagrams from the configured code.

Related Projects