[chirp_devel] [PATCH][BTECH] Bug fix about radios resetting on the download, fixes #3015


M.Sc. Pavel Milanes Costa

5 Apr 2016

1:53 p.m.

Hi to all, busy weekend and week ahead here, so patching will be slow on this side.

Patch comments:

The radios with the second ID can run into a condition in which the timeouts add up and the radio resets after ident and before download.

Affected models: QYT KT 8900 (& the clone JT-6188) & BTECH 2501+220 so far.

Temporarily setting a lower timeout in the second ID part has proved to fix this issue.

73 Pavel.

Attachments:

timeoutruncond.patch (text/x-patch — 1.5 KB)


Dan Smith

5 Apr

2:25 p.m.

...

```
  if radio._id2 is not False:
      # lower the timeout here as this radios are reseting due to timeout
      radio.pipe.setTimeout(0.05)
      # query & receive the extra ID
      _send(radio, _make_frame("S", 0x3DF0, 16))
      id2 = _rawrecv(radio, 21)
```

This just means that you're reading too much here, right? Do the radios have different sized responses? Can you not read the appropriate amount here to avoid the timeout entirely?

--Dan

M.Sc. Pavel Milanes Costa

4:41 p.m.

On 05/04/16 at 10:25, Dan Smith via chirp_devel wrote:

...

...
  if radio._id2 is not False:
      # lower the timeout here as this radios are reseting due to timeout
      radio.pipe.setTimeout(0.05)
      # query & receive the extra ID
      _send(radio, _make_frame("S", 0x3DF0, 16))
      id2 = _rawrecv(radio, 21)

This just means that you're reading too much here, right? Do the radios have different sized responses? Can you not read the appropriate amount here to avoid the timeout entirely?

Hi Dan,

No, it means that the radio answers with a variable length. The default serial timeout is 0.7 secs so that all radios fit in, but some radios have a second ID (this part of the code) that must be read before any other operation; this second ID is a simple read of 16 bytes (21 with ACK + headers) in the upper memmap.

But most of the time the radio answers with fewer than the 21 bytes (sometimes it does answer with 21 bytes), so the driver waits for the extra byte until the default serial timeout expires; meanwhile the radio sees no activity and resets itself, dropping out of clone mode.

That's why I have lowered the timeout here to a safe value of 0.05 secs, which practice has shown to be stable, at least in our test scenario.

This lower timeout is also good in the next portions of the code, in the upload procedure, where the driver has to read 1 or 2 bytes from these radios, the last one always being an ACK.

At the end of the ident I reset the serial timeout to the default 0.7 secs to be safe.
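A minimal sketch of the lower-then-restore timeout pattern described above. This is not the patch itself: `FakePipe` and `temporary_timeout` are hypothetical names used only for illustration, and the sketch assumes the port object exposes a plain `timeout` attribute rather than the `setTimeout()` call quoted in the patch.

```python
from contextlib import contextmanager


class FakePipe:
    """Hypothetical stand-in for the serial port; only the timeout matters here."""

    def __init__(self, timeout=0.7):
        self.timeout = timeout


@contextmanager
def temporary_timeout(pipe, seconds):
    """Lower pipe.timeout inside the block and always restore it afterwards."""
    saved = pipe.timeout
    pipe.timeout = seconds
    try:
        yield pipe
    finally:
        # Restore even if the ID read raises, so later reads keep the default.
        pipe.timeout = saved


pipe = FakePipe(timeout=0.7)
with temporary_timeout(pipe, 0.05):
    inside = pipe.timeout  # the second-ID read would happen here
after = pipe.timeout
```

Wrapping the restore in `finally` means an exception during the ident cannot leave the short timeout in place for the rest of the clone.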

Another approach would be reading fewer than the 21 bytes and doing a serial flush afterwards just to be sure. We only need to check for a specific string in this data block, depending on the radio model, but it will work only if the string is always far from the end of the block (I still have to check that).

This way is more intrusive and will require more code changes; I only realized this alternative just now.

--- Pavel.

Dan Smith

10:57 p.m.

...

No, it means that the radio answers with a variable length

You don't really think that the radio is answering with a variable length, right? Or are you saying we don't always know whether we need to read 8 or 16 bytes because we haven't identified the model yet? The radio isn't really capable of sending a random length of data...

...

This way is more intrusive and will require more code changes; I only realized this alternative just now.

I really think that it's most likely that you're reading the blocks at different sizes than you should, which causes you to get out of sync, and thus depend on the timeouts to avoid hanging too long in between them. That would explain why the download and upload performance differs so much -- sometimes we get lucky and don't hit a lot of timeouts, but if we get out of sync early, we hit the timeout on each block, introducing a couple hundred milliseconds of unneeded delay on each round.

If the radio is really writing blocks in different sizes with varying delays through the image, then the fully buffered approach I described earlier is definitely the way to go...

--Dan

M.Sc. Pavel Milanes Costa

6 Apr

2:06 p.m.

On 05/04/16 at 18:57, Dan Smith via chirp_devel wrote:

...

...
No, it means that the radio answers with a variable length

You don't really think that the radio is answering with a variable length, right? Or are you saying we don't always know whether we need to read 8 or 16 bytes because we haven't identified the model yet? The radio isn't really capable of sending a random length of data...

No, the first ID already confirms that this radio is the model I think it is; but the OEM software reads the second ID of the radio and we are copying that behavior... _and now I realize that maybe we don't need to do that_, as we know for sure which radio it is.

Jim, we will test it that way tonight.

This read of the second ID is way up in the high mem area, the area that is not touched by the OEM software; we are doing a simple and common read, but these radios have the bad behavior (bug? feature? flag?) of sending a misplaced \x05 byte on the first reads... and this misplaced \x05 is what makes the block shorter or longer.

The OEM always uses a 64-byte length on the reads, but this particular high mem read is always 16 bytes; the restriction you see later in the code about at least 16 bytes is because the string we need is in the lower 16 bytes.

This can be another approach: send the request for the 16 bytes and read at least 16 (of the 21 there should be), then process them and at the end do a serial flush to get the buffer clean (or a dummy read of the 5 following bytes).

...

...
This way is more intrusive and will require more code changes; I only realized this alternative just now.

I really think that it's most likely that you're reading the blocks at different sizes than you should, which causes you to get out of sync, and thus depend on the timeouts to avoid hanging too long in between them.

No, the OEM does it in these sizes, as the logs show, and we are doing it with the same size. The OEM always uses a 64-byte length on the reads, but this particular high mem read is always 16 bytes only.

...

That would explain why the download and upload performance differs so much -- sometimes we get lucky and don't hit a lot of timeouts, but if we get out of sync early, we hit the timeout on each block, introducing a couple hundred milliseconds of unneeded delay on each round.

Yes, it can be.

...

If the radio is really writing blocks in different sizes with varying delays through the image, then the fully buffered approach I described earlier is definitely the way to go...

No, the read size (64 bytes) is constant through the read, the same as we have on the write to the radio with 16 bytes in each block, but this particular high mem read is always 16 bytes only.

Yes, I'm playing with an approach of buffering the whole data stream and then processing it, of course in another similar scenario, to get used to it and maybe implement it here.

Dan, I vote for applying the patch, as Jim reported that it works and it doesn't break anything; we can test further along these two paths and maybe get one of them to work:

- Test if we can avoid this second ID read, as we know for sure from the normal ID that we have the correct radio.
- Test requesting the 16 bytes as usual, but reading at least 16 to process and then flushing the serial port, to get the buffer clean for the next steps.

So it's your call: apply it now, or wait to test the two mentioned paths to see if one of them works better?

The WACCOM Mini-8900 is another story (bug), with its annoying \x05 byte popping around and the OEM software doing silent retries when it finds the \x05 byte in a specific position. It's as if the OEM software knows it will not work and retries the entire process from the top.

Cheers, Pavel.

Dan Smith

8 Apr

9:31 p.m.

...

This read of the second ID is way up in the high mem area, the area that is not touched by the OEM software; we are doing a simple and common read, but these radios have the bad behavior (bug? feature? flag?) of sending a misplaced \x05 byte on the first reads... and this misplaced \x05 is what makes the block shorter or longer.

Before or after the block you have to read the longer block? Anything you can do to anticipate that the longer read is needed would be good.

Can you read the id block after the rest of the clone is done? You can always stitch the image together later if need be.

...

The OEM always uses a 64-byte length on the reads, but this particular high mem read is always 16 bytes; the restriction you see later in the code about at least 16 bytes is because the string we need is in the lower 16 bytes.

Meaning the OEM software asks for 64 bytes in its communication with the radio, or that the radio returns 64 bytes?

I don't think that the portmon-reported length is necessarily the length that the software is reading, but rather the amount returned to the application.

...

This can be another approach: send the request for the 16 bytes and read at least 16 (of the 21 there should be), then process them and at the end do a serial flush to get the buffer clean (or a dummy read of the 5 following bytes).

Any time you're doing a flush or throwing away data in the middle of a clone procedure, I'm highly concerned because it means we don't know what all we're reading ...

...

No, the read size (64 bytes) is constant through the read, the same as we have on the write to the radio with 16 bytes in each block, but this particular high mem read is always 16 bytes only.

So we could go with a timeout of 5 minutes then right? I'm being hyperbolic, but what I mean is, you only ever need a short timeout if you're reading more than is available...
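To make the point above concrete, here is a small simulation (not CHIRP code). `FakePort` is a hypothetical stand-in for a serial port: it returns whatever bytes are queued and adds its timeout to a simulated wait counter whenever fewer bytes are available than were requested. It only illustrates that the timeout value matters solely when a read asks for more than the radio sent.

```python
class FakePort:
    """Simulated serial port: hands out queued bytes and tallies timeout waits."""

    def __init__(self, data, timeout):
        self.buffer = bytearray(data)
        self.timeout = timeout
        self.waited = 0.0  # simulated seconds spent waiting for missing bytes

    def read(self, size):
        chunk = bytes(self.buffer[:size])
        del self.buffer[:size]
        if len(chunk) < size:
            # A real blocking read would sit here until the timeout expires.
            self.waited += self.timeout
        return chunk


reply = bytes(20)  # a short 20-byte answer, as in the wrong-ACK case

exact = FakePort(reply, timeout=0.7)
exact.read(20)   # asks for what is there: no wait, whatever the timeout

greedy = FakePort(reply, timeout=0.7)
greedy.read(21)  # asks for one byte too many: pays the full timeout
```

With a 0.05 s timeout the over-long read still waits, just for a much shorter time, which matches the behavior the patch relies on.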

--Dan

Pavel Milanes (CO7WT)

9 Apr

2:07 p.m.

On 08/04/16 at 17:31, Dan Smith via chirp_devel wrote:

...

...
This can be another approach: send the request for the 16 bytes and read at least 16 (of the 21 there should be), then process them and at the end do a serial flush to get the buffer clean (or a dummy read of the 5 following bytes).

Any time you're doing a flush or throwing away data in the middle of a clone procedure, I'm highly concerned because it means we don't know what all we're reading ...

Yes, further tests showed that it is better to read with a lower timeout than to use this approach of a short read and then a dump.

Pavel Milanes (CO7WT)

5:59 p.m.

On 08/04/16 at 17:31, Dan Smith via chirp_devel wrote:

...

So we could go with a timeout of 5 minutes then right? I'm being hyperbolic, but what I mean is, you only ever need a short timeout if you're reading more than is available...

The general (default) timeout is set to 0.7 secs because we have a bunch of radios to deal with, and these radios insert particular pauses (each model has its own, as high as 0.5 secs in some cases) between the request and the answer of a block.

In other words: we really have a ~0.2 sec margin for the radios that have the long pause, and the others just continue when they have the right amount of data (69 bytes); this works OK with all radios, with no timeouts or other failures.

The problem is with the radios that have the read in the high mem zone (second ID): it reads just 16 bytes instead of the normal 64 bytes, so the timeout must be short here, because at this particular moment the radios sometimes send a misplaced \x05 (we call it 'wrong ACK'; it can be a bug or a feature, who knows) that makes the requested block of 16 bytes short by one.

Normally: ACK + 4 bytes of headers + 16 bytes payload (21 bytes).
With the wrong ACK it's like this: WRONG-ACK + 4 bytes of headers + 15 bytes payload (20 bytes).

When the wrong ACK appears in the stream the block is just 20 bytes instead of the normal 21, so the driver keeps waiting for the normal 0.7 sec timeout and the radio resets, taking itself out of clone mode.

This is the root of the problem. My way of solving it is to lower the timeout so we capture at least the 20 incomplete bytes and time out ASAP to keep in sync with the radio, because we only care about 12 bytes in there and don't care whether the full 16 or only 15 bytes were received; I'm happy with the first 12 in the payload because that is where our check string is.

In fact the remaining 4 or 3 bytes are different in each radio model: sometimes all \x00, sometimes \xFF and even \x20.

Doing it like this allows me to move on quickly to the normal read, and it works.
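The frame layout described above can be sketched as follows. This is an illustration only, not the driver's code: `extract_check_bytes` is a hypothetical helper, the ACK value `\x06`, the header bytes and the `HYPOTHETICAL` check string are made-up sample data, and the sketch assumes the wrong-ACK reply loses its byte at the end of the payload.

```python
HEADER_LEN = 4   # header bytes following the leading ACK, per the thread
CHECK_LEN = 12   # only the first 12 payload bytes hold the check string


def extract_check_bytes(block):
    """Return the first 12 payload bytes of a 20- or 21-byte ID reply."""
    if len(block) not in (20, 21):
        raise ValueError("unexpected ID block length: %d" % len(block))
    start = 1 + HEADER_LEN  # skip ACK (or wrong ACK) plus the header
    return block[start:start + CHECK_LEN]


# Made-up sample replies; the real header and ID strings are not shown here.
header = b"S\x3d\xf0\x10"
normal = b"\x06" + header + b"HYPOTHETICAL" + b"\xff" * 4  # 21 bytes
short = b"\x05" + header + b"HYPOTHETICAL" + b"\xff" * 3   # 20 bytes

check_normal = extract_check_bytes(normal)
check_short = extract_check_bytes(short)
```

Because the check string sits in the first 12 payload bytes, both the 21-byte and the 20-byte reply yield the same result under these assumptions.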

Cheers.

Jim Unroe

5 Apr

10:57 p.m.

On Tue, Apr 5, 2016 at 9:53 AM, M.Sc. Pavel Milanes Costa via chirp_devel chirp_devel@intrepid.danplanet.com wrote:

...

Hi to all, busy weekend and week ahead here, so patching will be slow on this side.

Patch comments:

The radios with the second ID can run into a condition in which the timeouts add up and the radio resets after ident and before download.

Affected models: QYT KT 8900 (& the clone JT-6188) & BTECH 2501+220 so far.

Pavel,

This works with all of my BTech radios (UV-2501, UV-2501+220 & UV-5001) in both download and upload in Windows and Linux.

The WACCOM Mini-8900 Plus only works in Windows. With Linux, it fails to download after successfully identifying the radio and downloading the block that gets discarded.

Jim KC9HI

M.Sc. Pavel Milanes Costa

6 Apr

1:35 p.m.

On 05/04/16 at 18:57, Jim Unroe wrote:

...

On Tue, Apr 5, 2016 at 9:53 AM, M.Sc. Pavel Milanes Costa via chirp_devel chirp_devel@intrepid.danplanet.com wrote:

...
Hi to all, busy weekend and week ahead here, so patching will be slow on this side.

Patch comments:

The radios with the second ID can run into a condition in which the timeouts add up and the radio resets after ident and before download.

Affected models: QYT KT 8900 (& the clone JT-6188) & BTECH 2501+220 so far.

Pavel,

This works with all of my BTech radios (UV-2501, UV-2501+220 & UV-5001) in both download and upload in Windows and Linux.

The WACCOM Mini-8900 Plus only works in Windows. With Linux, it fails to download after successfully identifying the radio and downloading the block that gets discarded.

Jim KC9HI

So the patch works as expected.

The Waccom Mini-8900 on Linux is another issue, with a misplaced \x05 that happens only on Linux; I have a theory that I still have to test before patching.

Jim Unroe

2:10 p.m.

...

So the patch works as expected.

Yes. It does. But from what I am reading here, Dan would like to see you do it differently.

Jim
