Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSON configuration persistence #17

Merged
merged 11 commits into from
Aug 10, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
29 changes: 29 additions & 0 deletions conduit-config-persistence/build.gradle
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
plugins {
id 'java'
id "com.diffplug.spotless" version "5.1.1"
}

group 'io.dataline.conduit'
version '0.1.0'

repositories {
mavenCentral()
}

compileJava.dependsOn 'spotlessApply'

spotless {
java {
googleJavaFormat('1.8')
}
}

dependencies {
testCompile group: 'junit', name: 'junit', version: '4.12'

compile group: "com.fasterxml.jackson.core", name: "jackson-databind", version: "2.9.8"
compile group: "com.networknt", name: "json-schema-validator", version: "1.0.42"

implementation project(':conduit-config')

}
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
package io.dataline.conduit.persistence;

import java.util.Set;

public interface ConfigPersistence {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should all these methods throw so we can provide appropriate messages on the UI if needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you want them to throw? right now i am catching all I/O exceptions internally and trying to put reasonable error messages.

Copy link
Contributor

@sherifnada sherifnada Aug 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we have a throw in the signature of the methods, then the server has to handle exceptions well. Unless the impl you're going for is that these methods would throw reasonable messages and the server autowraps any failures nicely in a 500/400 in the UI? (and therefore doesn't need to wrap each individual failure)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

server autowraps any failures nicely in a 500/400

was not planning to do this.

can you give an example of what behavior you want? specifically what error you expect to have thrown and how the ui would interact with it?

Copy link
Contributor

@sherifnada sherifnada Aug 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if a config file is formatted incorrectly (fails validation), the UI should display a message saying for example: file is corrupt. Or if it is not found, should say file not found, please make sure your mount point is setup correctly. I'm imagining that since the server would be calling this persistence class, if the persistence class fails, we want to respond to the UI with a 400/500 with a human-readable message. I see two ways of doing this:

  1. every method throws, and the server wraps that and throws based on position of the failure in control flow. This is higher friction since the UI server must try-catch persistence operations.
  2. throw all un-fixable exceptions at the lowest level, and in the server whenever an exception is thrown, respond with a 500 whose message is exactly the failure message of the thrown exception

It seems like this impl is leaning towards no.2 above? this seems like a good choice since it's lowest friction.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can i punt on this until write some API endpoints and have a better feel for what will be ergonomic?

<T> T getConfig(PersistenceConfigType persistenceConfigType, String configId, Class<T> clazz);

<T> Set<T> getConfigs(PersistenceConfigType persistenceConfigType, Class<T> clazz);

<T> void writeConfig(
PersistenceConfigType persistenceConfigType, String configId, T config, Class<T> clazz);
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,181 @@
package io.dataline.conduit.persistence;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.networknt.schema.*;
import io.dataline.conduit.conduit_config.ConfigSchema;
import io.dataline.conduit.conduit_config.StandardScheduleConfiguration;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Collectors;

public class ConfigPersistenceImpl implements ConfigPersistence {
private static final String CONFIG_STORAGE_ROOT = "data/config/";
private static final String CONFIG_SCHEMA_ROOT = "conduit-config/src/main/resources/json/";

private final ObjectMapper objectMapper;
private final SchemaValidatorsConfig schemaValidatorsConfig;
private final JsonSchemaFactory jsonSchemaFactory;

public ConfigPersistenceImpl() {
objectMapper = new ObjectMapper();
schemaValidatorsConfig = new SchemaValidatorsConfig();
jsonSchemaFactory = JsonSchemaFactory.getInstance(SpecVersion.VersionFlag.V7);
}

@Override
public <T> T getConfig(
PersistenceConfigType persistenceConfigType, String configId, Class<T> clazz) {
// find file
File configFile = getFileOrThrow(persistenceConfigType, configId);

// validate file with schema
validateJson(configFile, persistenceConfigType, configId);

// cast file to type
return fileToPojo(configFile, clazz);
}

@Override
public <T> Set<T> getConfigs(PersistenceConfigType persistenceConfigType, Class<T> clazz) {
return getConfigIds(persistenceConfigType).stream()
.map(configId -> getConfig(persistenceConfigType, configId, clazz))
.collect(Collectors.toSet());
}

@Override
public <T> void writeConfig(
PersistenceConfigType persistenceConfigType, String configId, T config, Class<T> clazz) {
try {
objectMapper.writeValue(new File(getConfigPath(persistenceConfigType, configId)), config);
} catch (IOException e) {
throw new RuntimeException(e);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't these methods throw an IOException instead of rethrowing as a runtime exception?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can i punt on this until write some API endpoints and have a better feel for what will be ergonomic?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good.

}
}

private JsonSchema getSchema(PersistenceConfigType persistenceConfigType) {
String configSchemaFilename =
standardConfigTypeToConfigSchema(persistenceConfigType).getSchemaFilename();
File schemaFile = new File(String.format("%s/%s", CONFIG_SCHEMA_ROOT, configSchemaFilename));
JsonNode schema;
try {
schema = objectMapper.readTree(schemaFile);
} catch (IOException e) {
throw new RuntimeException(e);
}
return jsonSchemaFactory.getSchema(schema, schemaValidatorsConfig);
}

private Set<Path> getFiles(PersistenceConfigType persistenceConfigType) {
String configDirPath = getConfigDirectory(persistenceConfigType);
try {
return Files.list(new File(configDirPath).toPath()).collect(Collectors.toSet());
} catch (IOException e) {
throw new RuntimeException(e);
}
}

private String getConfigDirectory(PersistenceConfigType persistenceConfigType) {
return String.format("%s/%s", CONFIG_STORAGE_ROOT, persistenceConfigType.toString());
}

private String getConfigPath(PersistenceConfigType persistenceConfigType, String configId) {
return String.format("%s/%s", getConfigDirectory(persistenceConfigType), getFilename(configId));
}

private Set<String> getConfigIds(PersistenceConfigType persistenceConfigType) {
return getFiles(persistenceConfigType).stream()
.map(path -> path.getFileName().toString().replace(".json", ""))
.collect(Collectors.toSet());
}

private Optional<Path> getFile(PersistenceConfigType persistenceConfigType, String id) {
String configPath = getConfigPath(persistenceConfigType, id);
final Path path = Paths.get(configPath);
if (Files.exists(path)) {
return Optional.of(path);
} else {
return Optional.empty();
}
}

private String getFilename(String id) {
return String.format("%s.json", id);
}

private ConfigSchema standardConfigTypeToConfigSchema(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of having two enums that are identical instead of just one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ConfigSchema is mapping for name to file path. PersistenceConfigType is mean to be used as part of the interface but not tied to a file path.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any expectation for these to diverge in the future? This sort of switch/mapping feels like a bit of an antipattern.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure yet. i'm already regretting custom json disk persistence due to lack of ability to query by anything other than id.

PersistenceConfigType persistenceConfigType) {
switch (persistenceConfigType) {
case SOURCE_CONNECTION_CONFIGURATION:
return ConfigSchema.SOURCE_CONNECTION_CONFIGURATION;
case STANDARD_CONNECTION_STATUS:
return ConfigSchema.STANDARD_CONNECTION_STATUS;
case STANDARD_DISCOVERY_OUTPUT:
return ConfigSchema.STANDARD_DISCOVERY_OUTPUT;
case DESTINATION_CONNECTION_CONFIGURATION:
return ConfigSchema.DESTINATION_CONNECTION_CONFIGURATION;
case STANDARD_SYNC_CONFIGURATION:
return ConfigSchema.STANDARD_SYNC_CONFIGURATION;
case STANDARD_SYNC_SUMMARY:
return ConfigSchema.STANDARD_SYNC_SUMMARY;
case STANDARD_SYNC_STATE:
return ConfigSchema.STANDARD_SYNC_STATE;
case STATE:
return ConfigSchema.STATE;
case STANDARD_SYNC_SCHEDULE:
return ConfigSchema.STANDARD_SYNC_SCHEDULE;
default:
throw new RuntimeException(
String.format(
"No mapping from StandardConfigType to ConfigSchema for %s",
persistenceConfigType));
}
}

private void validateJson(
File configFile, PersistenceConfigType persistenceConfigType, String configId) {
JsonNode configJson;
try {
configJson = objectMapper.readTree(configFile);
} catch (IOException e) {
throw new RuntimeException(e);
}

JsonSchema schema = getSchema(persistenceConfigType);
Set<ValidationMessage> validationMessages = schema.validate(configJson);
if (validationMessages.size() > 0) {
throw new IllegalStateException(
String.format(
"json schema validation failed. type: %s id: %s \n errors: %s \n schema: \n%s \n object: \n%s",
persistenceConfigType,
configId,
validationMessages.stream()
.map(ValidationMessage::toString)
.collect(Collectors.joining(",")),
schema.getSchemaNode().toPrettyString(),
configJson.toPrettyString()));
}
}

private File getFileOrThrow(PersistenceConfigType persistenceConfigType, String configId) {
return getFile(persistenceConfigType, configId)
.map(Path::toFile)
.orElseThrow(
() ->
new RuntimeException(
String.format("config %s %s not found", persistenceConfigType, configId)));
}

private <T> T fileToPojo(File file, Class<T> clazz) {
try {
return objectMapper.readValue(file, clazz);
} catch (IOException e) {
throw new RuntimeException(e);
}
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
package io.dataline.conduit.persistence;

public enum PersistenceConfigType {
SOURCE_CONNECTION_CONFIGURATION,
STANDARD_CONNECTION_STATUS,
STANDARD_DISCOVERY_OUTPUT,
DESTINATION_CONNECTION_CONFIGURATION,
STANDARD_SYNC_CONFIGURATION,
STANDARD_SYNC_SUMMARY,
STANDARD_SYNC_STATE,
STATE,
STANDARD_SYNC_SCHEDULE
}
1 change: 1 addition & 0 deletions conduit-config/build.gradle
Original file line number Diff line number Diff line change
Expand Up @@ -31,4 +31,5 @@ dependencies {
}

jsonSchema2Pojo {
targetPackage = 'io.dataline.conduit.conduit_config'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Late to the party but it is weird to have conduit repeated twice. why not io.dataline.conduit.config

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was just pattern matching other module names. the extra namespace can be dropped.

}
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
package io.dataline.conduit.conduit_config;

public enum ConfigSchema {
SOURCE_CONNECTION_CONFIGURATION("SourceConnectionConfiguration.json"),
STANDARD_CONNECTION_STATUS("StandardConnectionStatus.json"),
STANDARD_DISCOVERY_OUTPUT("StandardDiscoveryOutput.json"),
DESTINATION_CONNECTION_CONFIGURATION("DestinationConnectionConfiguration.json"),
STANDARD_SYNC_CONFIGURATION("StandardSyncConfiguration.json"),
STANDARD_SYNC_SUMMARY("StandardSyncSummary.json"),
STANDARD_SYNC_STATE("StandardSyncState.json"),
STANDARD_SYNC_SCHEDULE("StandardSyncSchedule.json"),
STATE("State.json");

private final String schemaFilename;

ConfigSchema(String schemaFilename) {
this.schemaFilename = schemaFilename;
}

public String getSchemaFilename() {
return schemaFilename;
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "DestinationConnectionConfiguration",
"title": "DestinationConnectionConfiguration",
"description": "information required for connection to a destination.",
"type": "object",
"required": ["destinationSpecificationId", "configuration"],
"additionalProperties": false,
"properties": {
"destinationSpecificationId": {
"type": "string",
"format": "uuid"
},
"configuration": {
}
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "SourceConnectionConfiguration",
"title": "SourceConnectionConfiguration",
"description": "information required for connection to a destination.", "type": "object",
"required": ["sourceSpecificationId", "configuration"],
"additionalProperties": false,
"properties": {
"sourceSpecificationId": {
"type": "string",
"format": "uuid"
},
"configuration": {
}
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@
"description": "describes the result of a 'test connection' action.",
"type": "object",
"required": ["status"],
"additionalProperties": false,
"properties": {
"status": {
"type": "string",
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,8 @@
"schema": {
"description": "describes the available schema.",
"type": "object",
"required": ["tables"],
"additionalProperties": false,
"properties": {
"tables": {
"type": "array",
Expand All @@ -16,6 +18,7 @@
"name",
"columns"
],
"additionalProperties": false,
"properties": {
"name": {
"type": "string"
Expand All @@ -28,6 +31,7 @@
"name",
"dataType"
],
"additionalProperties": false,
"properties": {
"name": {
"type": "string"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@
"description": "describes the standard output for any discovery run.",
"type": "object",
"required": ["schema"],
"additionalProperties": false,
"properties": {
"schema": {
"description": "describes the available schema.",
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,8 @@
"title": "StandardSyncConfiguration",
"description": "configuration required for sync for ALL taps",
"type": "object",
"required": ["syncMode", "schema"],
"additionalProperties": false,
"properties": {
"syncMode": {
"type": "string",
Expand Down
Original file line number Diff line number Diff line change
@@ -1,8 +1,10 @@
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "StandardScheduleConfiguration",
"$id": "https://dataline.io/docs/models/StandardScheduleConfiguration.json",
"title": "StandardScheduleConfiguration",
"type": "object",
"required": ["timeUnit", "units"],
"additionalProperties": false,
"properties": {
"timeUnit": {
"type": "string",
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,8 @@
"title": "StandardSyncSummary",
"description": "standard information output by ALL taps for a sync step (our version of state.json)",
"type": "object",
"required": ["attemptId", "tables", "recordsSynced", "version", "status", "startTime", "endTime"],
"additionalProperties": false,
"properties": {
"attemptId": {
"type": "string",
Expand Down
17 changes: 17 additions & 0 deletions conduit-config/src/main/resources/json/State.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "DestinationConnectionConfiguration",
"title": "DestinationConnectionConfiguration",
"description": "information output by the connection.",
"type": "object",
"required": ["connectionId", "state"],
"additionalProperties": false,
"properties": {
"connectionId": {
"type": "string",
"format": "uuid"
},
"state": {
}
}
}
Loading