Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ai-help): delete ai history after six months #437

Merged
merged 8 commits into from
Mar 18, 2024
Merged
1 change: 1 addition & 0 deletions .settings.dev.toml
Original file line number Diff line number Diff line change
Expand Up @@ -51,5 +51,6 @@ flag_repo = "flags"
[ai]
api_key = ""
limit_reset_duration_in_sec = 3600
history_deletion_period_in_sec = 15_778_476
trigger_error_for_search_term = "Please give me an error in the search phase of AI conversation"
trigger_error_for_chat_term = "Please give me an error in the chat phase of the AI conversation"
7 changes: 4 additions & 3 deletions .settings.local.toml
Original file line number Diff line number Diff line change
Expand Up @@ -13,10 +13,10 @@ scopes = "openid profile email profile:subscriptions"
auth_cookie_name = "auth-cookie"
login_cookie_name = "login-cookie"
auth_cookie_secure = false
client_id="TEST_CLIENT_ID"
client_secret="TEST_CLIENT_SECRET"
client_id = "TEST_CLIENT_ID"
client_secret = "TEST_CLIENT_SECRET"
cookie_key = "DUwIFZuUYzRhHPlhOm6DwTHSDUSyR5SyvZHIeHdx4DIanxm5/GD/4dqXROLvn5vMofOYUq37HhhivjCyMCWP4w=="
admin_update_bearer_token="TEST_TOKEN"
admin_update_bearer_token = "TEST_TOKEN"

[application]
document_base_url = "https://developer.allizom.org"
Expand Down Expand Up @@ -51,3 +51,4 @@ flag_repo = "flags"
[ai]
api_key = ""
limit_reset_duration_in_sec = 3600
history_deletion_period_in_sec = 15_778_476
1 change: 1 addition & 0 deletions .settings.test.toml
Original file line number Diff line number Diff line change
Expand Up @@ -52,5 +52,6 @@ flag_repo = "flags"
limit_reset_duration_in_sec = 5
api_key = ""
explain_sign_key = "kmMAMku9PB/fTtaoLg82KjTvShg8CSZCBUNuJhUz5Pg="
history_deletion_period_in_sec = 15_778_476
trigger_error_for_search_term = "Please give me an error in the search phase of the AI conversation"
trigger_error_for_chat_term = "Please give me an error in the chat phase of the AI conversation"
20 changes: 20 additions & 0 deletions src/api/admin.rs
Original file line number Diff line number Diff line change
@@ -1,6 +1,11 @@
use crate::db::ai_history::do_delete_old_ai_history;
use crate::db::v2::synchronize_bcd_updates_db::update_bcd;
use crate::db::Pool;
use crate::settings::SETTINGS;
use actix_rt::ArbiterHandle;
use actix_web::dev::{HttpServiceFactory, ServiceRequest};
use actix_web::web::Data;
use actix_web::HttpResponse;
use actix_web::{web, Error};
use actix_web_httpauth::extractors::bearer::BearerAuth;
use actix_web_httpauth::middleware::HttpAuthentication;
Expand All @@ -22,4 +27,19 @@ pub fn admin_service() -> impl HttpServiceFactory {
web::scope("/admin-api")
.wrap(HttpAuthentication::bearer(validator))
.service(web::resource("/v2/updates/").route(web::post().to(update_bcd)))
.service(web::resource("/ai-history/").route(web::post().to(delete_old_ai_history)))
}

pub async fn delete_old_ai_history(
pool: Data<Pool>,
arbiter: Data<ArbiterHandle>,
) -> Result<HttpResponse, ApiError> {
if !arbiter.spawn(async move {
if let Err(e) = do_delete_old_ai_history(pool).await {
error!("{}", e);
}
Comment on lines +38 to +40
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering out loud if this would be "more" idiomatic:

Suggested change
if let Err(e) = do_delete_old_ai_history(pool).await {
error!("{}", e);
}
do_delete_old_ai_history(pool)
.await
.map_err(|e| error!("{}", e))
.ok();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it looks more rustful.

}) {
return Ok(HttpResponse::InternalServerError().finish());
}
Ok(HttpResponse::Accepted().finish())
}
35 changes: 35 additions & 0 deletions src/db/ai_history.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
use crate::db::schema::ai_help_history;
use crate::db::Pool;
use crate::diesel::QueryDsl;
use crate::{api::error::ApiError, settings::SETTINGS};
use actix_web::web::Data;
use chrono::Utc;
use diesel::{ExpressionMethods, RunQueryDsl};
use std::ops::Sub;
use std::time::Duration;

/// This removes old AI history records from the database. It is meant to be called from a
/// cron job calling the respective endpoint in the admin API.
pub async fn do_delete_old_ai_history(pool: Data<Pool>) -> Result<(), ApiError> {
let mut conn = pool.get()?;
let history_deletion_period_in_sec = SETTINGS
.ai
.as_ref()
.map(|ai| ai.history_deletion_period_in_sec)
.ok_or(ApiError::Generic(
"ai.history_deletion_period_in_sec missing from configuration".to_string(),
))?;

let oldest_timestamp = Utc::now()
.sub(Duration::from_secs(history_deletion_period_in_sec))
.naive_utc();

let affected_rows = diesel::delete(
ai_help_history::table.filter(ai_help_history::updated_at.lt(oldest_timestamp)),
)
.execute(&mut conn)?;
info!(
"Deleted old AI history before {oldest_timestamp}: {affected_rows} old record(s) deleted."
);
Ok(())
}
1 change: 1 addition & 0 deletions src/db/mod.rs
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
pub mod ai_explain;
pub mod ai_help;
pub mod ai_history;
pub mod documents;
pub mod error;
pub mod fxa_webhook;
Expand Down
1 change: 1 addition & 0 deletions src/settings.rs
Original file line number Diff line number Diff line change
Expand Up @@ -83,6 +83,7 @@ pub struct AI {
pub limit_reset_duration_in_sec: i64,
#[serde_as(as = "Base64")]
pub explain_sign_key: [u8; 32],
pub history_deletion_period_in_sec: u64,
}

#[serde_as]
Expand Down
109 changes: 108 additions & 1 deletion tests/api/ai_help_history.rs
Original file line number Diff line number Diff line change
Expand Up @@ -5,11 +5,19 @@ use actix_web::test;
use anyhow::Error;
use async_openai::types::ChatCompletionRequestMessage;
use async_openai::types::Role::{Assistant, User};
use chrono::{NaiveDateTime, Utc};
use diesel::dsl::count;
use diesel::prelude::*;
use diesel::{insert_into, ExpressionMethods, RunQueryDsl};
use rumba::ai::help::RefDoc;
use rumba::db::ai_help::{add_help_history, add_help_history_message};
use rumba::db::model::{AIHelpHistoryMessageInsert, SettingsInsert};
use rumba::db::model::{AIHelpHistoryInsert, AIHelpHistoryMessageInsert, SettingsInsert};
use rumba::db::schema::ai_help_history;
use rumba::db::settings::create_or_update_settings;
use rumba::settings::SETTINGS;
use serde_json::Value::Null;
use std::ops::Sub;
use std::time::Duration;
use uuid::Uuid;

const CHAT_ID: Uuid = Uuid::nil();
Expand Down Expand Up @@ -95,6 +103,105 @@ async fn test_history() -> Result<(), Error> {
Ok(())
}

#[actix_rt::test]
#[stubr::mock(port = 4321)]
async fn test_history_deletion() -> Result<(), Error> {
let pool = reset()?;
let app = test_app_with_login(&pool).await.unwrap();
let service = test::init_service(app).await;
let mut _logged_in_client = TestHttpClient::new(&service).await;
let mut conn = pool.get()?;

let history_deletion_period_in_sec = &SETTINGS
.ai
.as_ref()
.map(|ai| ai.history_deletion_period_in_sec)
.expect("ai.history_deletion_period_in_sec missing");

// Add an old chat history entry, double our configured period ago.
let ts = Utc::now()
.sub(Duration::from_secs(history_deletion_period_in_sec * 2))
.naive_utc();
let history = AIHelpHistoryInsert {
user_id: 1,
chat_id: Uuid::from_u128(1),
created_at: Some(ts),
updated_at: Some(ts),
label: "old entry".to_string(),
};
insert_into(ai_help_history::table)
.values(history)
.execute(&mut conn)?;

// Add a newer chat history entry, half of our configured period ago.
let ts = Utc::now()
// .checked_sub_months(Months::new(2))
.sub(Duration::from_secs(history_deletion_period_in_sec / 2))
.naive_utc();
let history = AIHelpHistoryInsert {
user_id: 1,
chat_id: Uuid::from_u128(2),
created_at: Some(ts),
updated_at: Some(ts),
label: "new entry".to_string(),
};
insert_into(ai_help_history::table)
.values(history)
.execute(&mut conn)?;

// check the history count before we run the delete job
let rec_count = ai_help_history::table
.filter(ai_help_history::user_id.eq(1))
.select(count(ai_help_history::user_id))
.first::<i64>(&mut conn)?;

assert_eq!(2, rec_count);

// Now, run the delete job.
let req = test::TestRequest::post()
.uri("/admin-api/ai-history/")
.insert_header((
"Authorization",
format!("Bearer {}", SETTINGS.auth.admin_update_bearer_token),
))
.to_request();

let res = test::call_service(&service, req).await;
assert!(res.response().status().is_success());

// Check database that the old entry is gone.
// Loop until we see the old entry is gone because the
// delete job runs asynchonously.
let mut retry = 0;
const MAX_RETRIES: u32 = 40;
Comment on lines +175 to +176
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious why you use u32, since u8 would be sufficient for both, and the difference is probably negligible here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, u8 would suffice here. It is test code though, so I we can be a bit more relaxed about this.

let mut records: Vec<NaiveDateTime>;
loop {
records = ai_help_history::table
.filter(ai_help_history::user_id.eq(1))
.select(ai_help_history::updated_at)
.get_results(&mut conn)?;
if records.len() == 1 {
break;
}

actix_rt::time::sleep(Duration::from_millis(5)).await;
retry += 1;
if retry > MAX_RETRIES {
break;
}
}
Comment on lines +178 to +192
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this could be made more readable by using the retry crate, and we could reuse it to stabilize those flaky tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, probably. Lets look into retry crate next time we have flakyness issues.


assert_eq!(1, records.len());
// we only compare up to millisecond precision
assert_eq!(
ts.timestamp_micros(),
records.get(0).unwrap().timestamp_micros()
);

drop_stubr(stubr).await;
Ok(())
}

fn normalize_digits(s: &str) -> String {
let mut result = String::new();

Expand Down
Loading