Commit

Revert "Reordering columns"
Column reordering is broken. 😢

This reverts commit be87768.
joeygibson committed Apr 23, 2021
1 parent 680cb74 commit 0803038
Showing 8 changed files with 83 additions and 391 deletions.
4 changes: 0 additions & 4 deletions .gitignore
@@ -1,7 +1,3 @@
private-data
verticat
xxx.bin
reo1.txt
avtr.txt
avt.txt
verticareader
82 changes: 10 additions & 72 deletions README.md
@@ -10,89 +10,27 @@ from [Vertica native binary files](https://www.vertica.com/docs/9.3.x/HTML/Cont
## Usage

```bash
Usage: verticat [-cfHpv] [-h value] [-o value] [-r value] [-t value] [file...]
Usage: verticat [-cfHv] [-h value] [-o value] [-t value] [file...]
-c, --count count rows
-f, --force force overwrite of output file
-h, --head=value take the first n rows
-H, --help show help
-o, --output=value
write head/tail results to this file
-p, --print-header
print out header and exit
-r, --reorder=value
reorder columns based on this file
-t, --tail=value take the last n rows
-v, --version show version
```

## Options
Running with `--count` will print out the number of data rows in the file(s). The header does not count as a row.

`--count` or `-c` will print out the number of data rows in the file(s).
The header does not count as a row.
Running with `--head` or `--tail` will copy the first `n` rows, or the last `n` rows, respectively, to the output. When
either is used with multiple files, `n` rows will be taken from each file, and written to the output.

`--force`, or `-f` will overwrite an output file, if it exists.
If `--output` is not specified, `stdout` is used. Be careful with this, since binary data echoed to the console may be
undesirable.

`--head`, or `-h` copies the first `n` rows of the given file(s). If multiple files are
given, they must all have an identical layout, as the header of the first will be
copied with all of the output.
If combining multiple files with `--head`, `--tail`, or no options, they must all share the same column layout.

`--help`, or `-H` prints usage information.

`--output`, or `-o` gives a filename to send the output to. If this is not specified,
output goes to `stdout`.

`--print-header`, or `-p` will print the list of column widths, in order, and exit.

`--reorder`, or `-r`, specifies a file with the desired column reordering. The
indices can be separated by any sort of whitespace (spaces, tabs, newlines, etc.).

`--tail`, or `-t`, copies the last `n` rows of the given file(s). As with `--head`,
if multiple files are given, they must all have an identical layout, as the header
of the first will be copied with all of the output.

`--version`, or `-v` displays the program version.

## Notes

If no filenames are given, `verticat` acts like the standard `cat` program, reading
from `stdin`. **This only works with a single file**, since Vertica native files
have a header. If you need to combine multiple files, give them as arguments to the
program itself.

## Examples

To read the first 5 lines of `foo.bin`, and write the output to `bar.bin`,

```bash
verticat --head 5 -o bar.bin foo.bin
```

To read the last 5 lines of `foo.bin`, and write the output to `bar.bin`,

```bash
verticat --tail 5 -o bar.bin foo.bin
```

To read the first 5 lines of `foo.bin`, `bar.bin`, and `baz.bin`, and write the
output to `quux.bin`

```bash
verticat --head 5 -o quux.bin foo.bin bar.bin baz.bin
```

To combine all of `foo.bin` and `bar.bin` and write the output to `baz.bin`

```bash
verticat -o baz.bin foo.bin bar.bin
```

To reorder `foo.bin` using the ordering specified in `reo.txt`, and write to `stdout`.
In this example, the first five columns will be written in reverse order, and the final
four in their original order.

```bash
$> cat reo.txt
5, 4, 3, 2, 1, 6, 7, 8, 9

$> verticat -r reo.txt foo.bin
```
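The reorder file in the removed example is comma-separated, while the removed option text describes whitespace-separated indices. The following is a minimal sketch of a parser that accepts both forms. `parseOrder` is a hypothetical helper written for illustration; it is not taken from the verticat source, and the choice to tolerate commas is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"unicode"
)

// parseOrder is a hypothetical helper (not part of verticat). It splits the
// text on commas and any Unicode whitespace, then parses each field as an
// unsigned decimal column index.
func parseOrder(text string) ([]uint, error) {
	fields := strings.FieldsFunc(text, func(r rune) bool {
		return r == ',' || unicode.IsSpace(r)
	})

	order := make([]uint, 0, len(fields))
	for _, f := range fields {
		n, err := strconv.ParseUint(f, 10, 32)
		if err != nil {
			return nil, fmt.Errorf("bad column index %q: %w", f, err)
		}
		order = append(order, uint(n))
	}
	return order, nil
}

func main() {
	order, err := parseOrder("5, 4, 3, 2, 1, 6, 7, 8, 9\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(order)
}
```

The sketch only tokenizes the input; checking that the indices form a valid ordering for a given file would be a separate step.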
If no options are given, `verticat` acts like the standard `cat` program, reading from `stdin` if no files are given. Since
Vertica native files start with a metadata header, if you want to cat multiple files together, specify them as arguments
to `verticat` itself.
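To illustrate why plain byte-level concatenation does not work, here is a minimal sketch that serializes the header fields visible in this commit's `ColumnDefinitions` struct. The `columnHeader` type and `encodeHeader` function are hypothetical, and the field order and the header-length arithmetic are assumptions based on the variable declarations in `ReadColumnDefinitions`, not a verified copy of the file format.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// columnHeader is a hypothetical stand-in for the fields shown in the diff.
// The on-disk order used below is an assumption.
type columnHeader struct {
	HeaderLength    uint32
	Version         uint16
	Filler          byte
	NumberOfColumns uint16
	Widths          []uint32
}

// encodeHeader writes every field in little-endian order, the byte order the
// diff passes to binary.Write.
func encodeHeader(h columnHeader) ([]byte, error) {
	var buf bytes.Buffer

	fixed := []interface{}{h.HeaderLength, h.Version, h.Filler, h.NumberOfColumns}
	for _, field := range fixed {
		if err := binary.Write(&buf, binary.LittleEndian, field); err != nil {
			return nil, err
		}
	}
	if err := binary.Write(&buf, binary.LittleEndian, h.Widths); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	h := columnHeader{
		HeaderLength:    13, // assumed: 2 + 1 + 2 + 4 bytes per column width
		Version:         1,
		NumberOfColumns: 2,
		Widths:          []uint32{4, 8},
	}
	b, err := encodeHeader(h)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%d header bytes: % x\n", len(b), b)
}
```

Every file carries a block like this before its rows, so appending one file to another with a byte-level tool would leave the second file's header bytes in the middle of the row data.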
16 changes: 0 additions & 16 deletions avtr.txt

This file was deleted.

24 changes: 4 additions & 20 deletions lib/column_definitions.go
@@ -12,7 +12,6 @@ type ColumnDefinitions struct {
Filler byte
NumberOfColumns uint16
Widths []uint32
NewColumnOrder []uint
}

func (c ColumnDefinitions) Write(file io.Writer) (err error) {
@@ -36,29 +35,15 @@ func (c ColumnDefinitions) Write(file io.Writer) (err error) {
return
}

if c.NewColumnOrder == nil {
err = binary.Write(file, binary.LittleEndian, c.Widths)
if err != nil {
return
}
} else {
// Re-order the column widths based on the specified order
orderedWidths := make([]uint32, len(c.Widths))

for i, val := range c.NewColumnOrder {
orderedWidths[i] = c.Widths[val - 1]
}

err = binary.Write(file, binary.LittleEndian, orderedWidths)
if err != nil {
return
}
err = binary.Write(file, binary.LittleEndian, c.Widths)
if err != nil {
return
}

return nil
}

func ReadColumnDefinitions(file *os.File, newColumnOrder []uint) (ColumnDefinitions, error) {
func ReadColumnDefinitions(file *os.File) (ColumnDefinitions, error) {
var headerLength uint32
var version uint16
var filler byte
@@ -97,7 +82,6 @@ func ReadColumnDefinitions(file *os.File, newColumnOrder []uint) (ColumnDefiniti
Filler: filler,
NumberOfColumns: numberOfColumns,
Widths: widths,
NewColumnOrder: newColumnOrder,
}

return definitions, nil
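The reverted `Write` branch indexed `c.Widths[val - 1]` without checking the requested order first. The commit message does not say what was broken, so the following is not a diagnosis or the author's fix. It is only a sketch of how a 1-based ordering could be validated before it is applied. `reorderWidths` is a hypothetical helper, and the rule that every column must appear exactly once is an assumption.

```go
package main

import "fmt"

// reorderWidths is a hypothetical helper (not part of verticat). order holds
// 1-based source column indices, matching the convention in the reverted
// code. It rejects orders that are the wrong length, out of range, or that
// repeat a column, instead of panicking on a bad index.
func reorderWidths(widths []uint32, order []uint) ([]uint32, error) {
	if len(order) != len(widths) {
		return nil, fmt.Errorf("order has %d entries, want %d", len(order), len(widths))
	}

	seen := make([]bool, len(widths))
	out := make([]uint32, len(widths))
	for i, val := range order {
		if val < 1 || val > uint(len(widths)) {
			return nil, fmt.Errorf("column index %d out of range 1..%d", val, len(widths))
		}
		if seen[val-1] {
			return nil, fmt.Errorf("column index %d appears more than once", val)
		}
		seen[val-1] = true
		out[i] = widths[val-1]
	}
	return out, nil
}

func main() {
	widths := []uint32{4, 8, 1, 2}
	reordered, err := reorderWidths(widths, []uint{2, 1, 4, 3})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(reordered)
}
```

Returning an error keeps a malformed reorder file from causing an index-out-of-range panic, which the unchecked `val - 1` lookup would do for an index of 0 or one larger than the column count.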
2 changes: 1 addition & 1 deletion lib/column_definitions_test.go
@@ -22,7 +22,7 @@ func TestReadColumnDefinitions(t *testing.T) {
t.Fatal("signature didn't match")
}

definitions, err := ReadColumnDefinitions(file, nil)
definitions, err := ReadColumnDefinitions(file)
if err != nil {
t.Fatal("error reading column definitions: ", err)
}