So clk is 50 MHz. Is tx_clk a 5 MHz strobe (high for 1 out of every 10 clk periods) in the 50 MHz clock domain? If not, some of the assumptions here are invalid.
To see where all the delays are, you'd have to show how the tx_clk strobe and frame_end are generated since the shifter you describe has no more delays beyond what those signals cause. The clk edge produces a txout when txclk is active while in the idle state. [But are you in the idle state?]
Since your description suggests both tx_clk and frame_end are synchronized to the 50 MHz clock and tx_clk occurs [approximately, perhaps] 1 out of every 10 clocks, a delay in frame_end would cost a full tx_clk period, i.e. 10 x 20ns = 200ns of additional output delay. So it must be the generation of the tx_clk strobe where most of the delay you're seeing is introduced.
To properly synchronize an external trigger or slow clock to the 50 MHz domain would typically take 20-40ns: one clock edge to sample the external signal, and the next clock edge to use that single sampled value across all dependent registers. This could be reduced by sampling the trigger on the 50 MHz falling edge and using the synchronized value on the rising edge, cutting the additional synchronization delay to 10-20ns. I suspect that your tx_clk strobe is registered again, adding another 20ns, rather than being a combinatorial value (wire) derived from your registered external signal and its delayed version.
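As a rough illustration of that last point, here is a minimal sketch of a strobe built combinatorially from a registered sample and its delayed copy. The module and signal names (ext_sync, ext_in, strobe, sync_q, sync_d) are made up for the example, and it assumes a synchronous active-high reset; your actual generation logic isn't shown, so treat this as one possible shape rather than a drop-in fix.

```verilog
// Sketch only: ext_sync, ext_in, strobe, sync_q and sync_d are hypothetical names.
module ext_sync (
    input  wire clk,     // 50 MHz system clock
    input  wire reset,   // synchronous, active-high (assumption)
    input  wire ext_in,  // asynchronous external trigger or slow clock
    output wire strobe   // one-clk-wide pulse per rising edge of ext_in
);
    reg sync_q;  // external signal sampled on a clk edge
    reg sync_d;  // the same sample delayed by one clk

    always @(posedge clk) begin
        if (reset) begin
            sync_q <= 1'b0;
            sync_d <= 1'b0;
        end else begin
            sync_q <= ext_in;
            sync_d <= sync_q;
        end
    end

    // Combinatorial strobe (a wire): high while the new sample is 1 and the
    // previous sample was 0. Registering this again would add another 20ns.
    assign strobe = sync_q & ~sync_d;
endmodule
```

The strobe goes high in the cycle after ext_in is first sampled high, so dependent registers act on it at the following clk edge. A more conservative design might add a further synchronizer flop ahead of sync_q for metastability margin, at the cost of one more 20ns period.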
A question for you to help consider the functionality of your code: what happens between frames? Do you expect to be sitting in the idle state? Do you expect to be actively shifting an empty shift register (all zeros)?
As a seasoned FPGA guy, I'd design the shift register without states: a load that occurs on the external trigger, and a shift for the rest of the time. If the external trigger could occur immediately after a frame is completed (or sooner?!), additional logic would be needed to guarantee proper handling of the synchronized control. I wouldn't use a separate always @* block, and I'd use a conditional operator (a<=b?c:d;) for load vs shift. Concatenation {a,b} is also very useful on both the rhs and the lhs of an assignment. The module you've produced could be boiled down to a simple clocked always block with one line each in the "if( reset )" and "else if( tx_clk )" branches, which would actually increase readability and understanding. It's the logic that generates the other synchronized signals that really defines the delays you're seeing.
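To make that concrete, here is a minimal sketch of the stateless load/shift register described above. The names tx_shifter, WIDTH, load, data_in and shreg are hypothetical, and it assumes MSB-first output, a synchronous reset, and a load signal that is already synchronized to clk and held until a tx_clk strobe sees it. Your frame format may well differ.

```verilog
// Sketch only: tx_shifter, WIDTH, load, data_in and shreg are hypothetical names.
module tx_shifter #(
    parameter WIDTH = 8              // frame length in bits (assumption)
) (
    input  wire             clk,     // 50 MHz system clock
    input  wire             reset,   // synchronous, active-high (assumption)
    input  wire             tx_clk,  // one-clk-wide strobe, about 1 in 10 clocks
    input  wire             load,    // synchronized trigger, valid when tx_clk is high
    input  wire [WIDTH-1:0] data_in, // frame to send, MSB first
    output reg              txout
);
    reg [WIDTH-1:0] shreg;

    always @(posedge clk) begin
        if (reset)
            {txout, shreg} <= {(WIDTH+1){1'b0}};
        else if (tx_clk)
            // Load: first bit goes straight to txout. Otherwise: shift left,
            // filling with zeros, so txout idles low between frames.
            {txout, shreg} <= load ? {data_in, 1'b0} : {shreg, 1'b0};
    end
endmodule
```

With this shape there is no idle state to be in: between frames the register simply shifts zeros, and the only latency from trigger to first bit is whatever the logic generating load and tx_clk adds.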