Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug Fixes for Release 2.4.1 #1622

Merged
merged 22 commits into from
Feb 11, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
22 commits
Select commit Hold shift + click to select a range
848b721
* update renameFileOptions in fuse2 handler.
syeleti-msft Feb 4, 2025
d293bc8
Modify test for attribute cache
syeleti-msft Feb 4, 2025
4d11c5b
Cleanup the code
syeleti-msft Feb 4, 2025
e826005
* Modify the Etag of dst attribute when renaming the file
syeleti-msft Feb 5, 2025
1eebe3c
Modify comments
syeleti-msft Feb 5, 2025
7c129c1
Update comments
syeleti-msft Feb 5, 2025
4e69519
* Add atomic_o_trunc flag in fuse2 options that will allow the fuse l…
syeleti-msft Feb 5, 2025
8129b48
fix go static test
syeleti-msft Feb 5, 2025
029fea8
Add test for O_TRUNC flag
syeleti-msft Feb 6, 2025
740230f
Check where crash happened
syeleti-msft Feb 6, 2025
09526c0
Take lock before accessing the value from map inside the handle
syeleti-msft Feb 6, 2025
effd80c
Fix accessing the Etag causing panic in Download method of blockcache
syeleti-msft Feb 6, 2025
d4b58db
Spell check
syeleti-msft Feb 7, 2025
d952727
Use only one interface for apt(i.e., use apt-get always)
syeleti-msft Feb 7, 2025
69a1cd2
Merge branch 'nytlyfail' into nytly-pp-test
syeleti-msft Feb 7, 2025
1adb32b
finding out some strange behaviour
syeleti-msft Feb 8, 2025
c97955d
Support O_TRUNC flag in filecache
syeleti-msft Feb 8, 2025
9324ce0
Update Changelog
syeleti-msft Feb 8, 2025
915f1b8
Merge branch 'main' into syeleti/missingReleaseChanges
vibhansa-msft Feb 10, 2025
ef73745
Update CHANGELOG.md
syeleti-msft Feb 10, 2025
f7017e9
Downgrade AzDatalake SDK version
syeleti-msft Feb 11, 2025
43c7e99
Add unit test
syeleti-msft Feb 11, 2025
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,8 @@
- Correct statFS results to reflect block-cache in memory cache status.
- Do not wipeout temp-cache on start after a un-graceful unmount, if `cleanup-on-start` is not configured in file-cache.
- When the subdirectory is mounted and there is some file/folder operation, remove only the subdirectory path from the file paths.
- Enable atomic_o_trunc flag in libfuse to allow O_TRUNC flag to come in the open call for fuse2.
- In file-cache, when the O_TRUNC flag is passed to the open call and no modifications were done to the file before closing it then update the file in the Azure Storage to size 0.

**Other Changes**
- Optimized listing operation on HNS account to support symlinks.
Expand Down
Empty file.
11 changes: 0 additions & 11 deletions blobfuse2-nightly.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -113,7 +113,6 @@ stages:
steps:
- script: |
sudo apt-get update --fix-missing
sudo apt update
sudo apt-get install cmake gcc $(fuselib) git parallel -y
if [ $(tags) == "fuse2" ]; then
sudo apt-get install fuse -y
Expand Down Expand Up @@ -243,7 +242,6 @@ stages:
steps:
- script: |
sudo apt-get update --fix-missing
sudo apt update
sudo apt-get install cmake gcc $(fuselib) git parallel -y
if [ $(tags) == "fuse2" ]; then
sudo apt-get install fuse -y
Expand Down Expand Up @@ -383,7 +381,6 @@ stages:
steps:
- script: |
sudo apt-get update --fix-missing
sudo apt update
sudo apt-get install cmake gcc $(fuselib) git -y
if [ $(tags) == "fuse2" ]; then
sudo apt-get install fuse -y
Expand Down Expand Up @@ -506,7 +503,6 @@ stages:
steps:
- script: |
sudo apt-get update --fix-missing
sudo apt update
sudo apt-get install cmake gcc $(fuselib) git -y
if [ $(tags) == "fuse2" ]; then
sudo apt-get install fuse -y
Expand Down Expand Up @@ -620,7 +616,6 @@ stages:
steps:
- script: |
sudo apt-get update --fix-missing
sudo apt update
sudo apt-get install cmake gcc libfuse3-dev git -y
displayName: 'Install libfuse'

Expand Down Expand Up @@ -1385,7 +1380,6 @@ stages:
installStep:
script: |
sudo apt-get update --fix-missing
sudo apt update
sudo apt-get install cmake gcc $(fuselib) git parallel -y
if [ $(tags) == "fuse2" ]; then
sudo apt-get install fuse -y
Expand Down Expand Up @@ -1462,7 +1456,6 @@ stages:
installStep:
script: |
sudo apt-get update --fix-missing
sudo apt update
sudo apt-get install cmake gcc $(fuselib) git parallel -y
if [ $(tags) == "fuse2" ]; then
sudo apt-get install fuse -y
Expand Down Expand Up @@ -1541,7 +1534,6 @@ stages:
installStep:
script: |
sudo apt-get update --fix-missing
sudo apt update
sudo apt-get install cmake gcc $(fuselib) git parallel -y
if [ $(tags) == "fuse2" ]; then
sudo apt-get install fuse -y
Expand Down Expand Up @@ -1639,7 +1631,6 @@ stages:
installStep:
script: |
sudo apt-get update --fix-missing
sudo apt update
sudo apt-get install cmake gcc $(fuselib) git parallel -y
if [ $(tags) == "fuse2" ]; then
sudo apt-get install fuse -y
Expand Down Expand Up @@ -1716,7 +1707,6 @@ stages:
installStep:
script: |
sudo apt-get update --fix-missing
sudo apt update
sudo apt-get install cmake gcc $(fuselib) git parallel -y
if [ $(tags) == "fuse2" ]; then
sudo apt-get install fuse -y
Expand Down Expand Up @@ -1793,7 +1783,6 @@ stages:
installStep:
script: |
sudo apt-get update --fix-missing
sudo apt update
sudo apt-get install cmake gcc $(fuselib) git parallel -y
if [ $(tags) == "fuse2" ]; then
sudo apt-get install fuse -y
Expand Down
7 changes: 6 additions & 1 deletion component/attr_cache/attr_cache.go
Original file line number Diff line number Diff line change
Expand Up @@ -428,6 +428,11 @@ func (ac *AttrCache) TruncateFile(options internal.TruncateFileOptions) error {
if found && value.valid() && value.exists() {
value.setSize(options.Size)
}
// todo: invalidating path here rather than updating with etag
// due to some changes that are required in az storage comp which
// were not necessarily required. Once they were done invalidation
// of the attribute can be removed.
ac.invalidatePath(options.Name)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Invalidate the attribute in attribute cache after doing a truncate operation on file. This is sort of quick hack done to avoid the complexity of truncate file in az storage component.

}
return err
}
Expand Down Expand Up @@ -589,7 +594,7 @@ func (ac *AttrCache) CommitData(options internal.CommitDataOptions) error {
if err == nil {
ac.cacheLock.RLock()
defer ac.cacheLock.RUnlock()

// TODO: Could we just update the size, etag, modtime of the file here?
ac.invalidatePath(options.Name)
}
return err
Expand Down
13 changes: 7 additions & 6 deletions component/attr_cache/attr_cache_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -815,12 +815,13 @@ func (suite *attrCacheTestSuite) TestTruncateFile() {

err = suite.attrCache.TruncateFile(options)
suite.assert.Nil(err)
suite.assert.Contains(suite.attrCache.cacheMap, path)
suite.assert.NotEqualValues(suite.attrCache.cacheMap[path].attr, &internal.ObjAttr{})
suite.assert.EqualValues(suite.attrCache.cacheMap[path].attr.Size, size) // new size should be set
suite.assert.EqualValues(suite.attrCache.cacheMap[path].attr.Mode, defaultMode)
suite.assert.True(suite.attrCache.cacheMap[path].valid())
suite.assert.True(suite.attrCache.cacheMap[path].exists())
// suite.assert.Contains(suite.attrCache.cacheMap, path)
// suite.assert.NotEqualValues(suite.attrCache.cacheMap[path].attr, &internal.ObjAttr{})
// suite.assert.EqualValues(suite.attrCache.cacheMap[path].attr.Size, size) // new size should be set
// suite.assert.EqualValues(suite.attrCache.cacheMap[path].attr.Mode, defaultMode)
// suite.assert.True(suite.attrCache.cacheMap[path].valid())
// suite.assert.True(suite.attrCache.cacheMap[path].exists())
suite.assert.False(suite.attrCache.cacheMap[path].valid())
}

// Tests CopyFromFile
Expand Down
13 changes: 8 additions & 5 deletions component/azstorage/block_blob.go
Original file line number Diff line number Diff line change
Expand Up @@ -299,6 +299,7 @@ func (bb *BlockBlob) DeleteDirectory(name string) (err error) {
// Source file must exist in storage account before calling this method.
// When the rename is success, Data, metadata, of the blob will be copied to the destination.
// Creation time and LMT is not preserved for copyBlob API.
// Etag of the destination blob changes.
// Copy the LMT to the src attr if the copy is success.
// https://learn.microsoft.com/en-us/rest/api/storageservices/copy-blob?tabs=microsoft-entra-id
func (bb *BlockBlob) RenameFile(source string, target string, srcAttr *internal.ObjAttr) error {
Expand Down Expand Up @@ -326,6 +327,7 @@ func (bb *BlockBlob) RenameFile(source string, target string, srcAttr *internal.
}

var dstLMT *time.Time = copyResponse.LastModified
var dstETag string = sanitizeEtag(copyResponse.ETag)

copyStatus := copyResponse.CopyStatus
var prop blob.GetPropertiesResponse
Expand All @@ -344,10 +346,11 @@ func (bb *BlockBlob) RenameFile(source string, target string, srcAttr *internal.

if pollCnt > 0 {
dstLMT = prop.LastModified
dstETag = sanitizeEtag(prop.ETag)
}

if copyStatus != nil && *copyStatus == blob.CopyStatusTypeSuccess {
modifyLMT(srcAttr, dstLMT)
modifyLMTandEtag(srcAttr, dstLMT, dstETag)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update the ETag for dst attribute while renaming the file

}

log.Trace("BlockBlob::RenameFile : %s -> %s done", source, target)
Expand Down Expand Up @@ -453,7 +456,7 @@ func (bb *BlockBlob) getAttrUsingRest(name string) (attr *internal.ObjAttr, err
Crtime: *prop.CreationTime,
Flags: internal.NewFileBitMap(),
MD5: prop.ContentMD5,
ETag: strings.Trim(string(*prop.ETag), `"`),
ETag: sanitizeEtag(prop.ETag),
}

parseMetadata(attr, prop.Metadata)
Expand Down Expand Up @@ -662,7 +665,7 @@ func (bb *BlockBlob) getBlobAttr(blobInfo *container.BlobItem) (*internal.ObjAtt
Crtime: bb.dereferenceTime(blobInfo.Properties.CreationTime, *blobInfo.Properties.LastModified),
Flags: internal.NewFileBitMap(),
MD5: blobInfo.Properties.ContentMD5,
ETag: strings.Trim((string)(*blobInfo.Properties.ETag), `"`),
ETag: sanitizeEtag(blobInfo.Properties.ETag),
}

parseMetadata(attr, blobInfo.Metadata)
Expand Down Expand Up @@ -940,7 +943,7 @@ func (bb *BlockBlob) ReadInBuffer(name string, offset int64, len int64, data []b
}

if etag != nil {
*etag = strings.Trim(string(*downloadResponse.ETag), `"`)
*etag = sanitizeEtag(downloadResponse.ETag)
}

return nil
Expand Down Expand Up @@ -1611,7 +1614,7 @@ func (bb *BlockBlob) CommitBlocks(name string, blockList []string, newEtag *stri
}

if newEtag != nil {
*newEtag = strings.Trim(string(*resp.ETag), `"`)
*newEtag = sanitizeEtag(resp.ETag)
}

return nil
Expand Down
5 changes: 3 additions & 2 deletions component/azstorage/datalake.go
Original file line number Diff line number Diff line change
Expand Up @@ -319,6 +319,7 @@ func (dl *Datalake) DeleteDirectory(name string) (err error) {

// RenameFile : Rename the file
// While renaming the file, Creation time is preserved but LMT is changed for the destination blob.
// and also Etag of the destination blob changes
func (dl *Datalake) RenameFile(source string, target string, srcAttr *internal.ObjAttr) error {
log.Trace("Datalake::RenameFile : %s -> %s", source, target)

Expand All @@ -337,7 +338,7 @@ func (dl *Datalake) RenameFile(source string, target string, srcAttr *internal.O
return err
}
}
modifyLMT(srcAttr, renameResponse.LastModified)
modifyLMTandEtag(srcAttr, renameResponse.LastModified, sanitizeEtag(renameResponse.ETag))
return nil
}

Expand Down Expand Up @@ -400,7 +401,7 @@ func (dl *Datalake) GetAttr(name string) (blobAttr *internal.ObjAttr, err error)
Ctime: *prop.LastModified,
Crtime: *prop.LastModified,
Flags: internal.NewFileBitMap(),
ETag: (string)(*prop.ETag),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sanitize the ETag in datalake before adding it to the attribute

ETag: sanitizeEtag(prop.ETag),
}
parseMetadata(blobAttr, prop.Metadata)

Expand Down
10 changes: 9 additions & 1 deletion component/azstorage/utils.go
Original file line number Diff line number Diff line change
Expand Up @@ -590,14 +590,22 @@ func removeLeadingSlashes(s string) string {
return s
}

func modifyLMT(attr *internal.ObjAttr, lmt *time.Time) {
func modifyLMTandEtag(attr *internal.ObjAttr, lmt *time.Time, ETag string) {
if attr != nil {
attr.Atime = *lmt
attr.Mtime = *lmt
attr.Ctime = *lmt
attr.ETag = ETag
}
}

func sanitizeEtag(ETag *azcore.ETag) string {
if ETag != nil {
return strings.Trim(string(*ETag), `"`)
}
return ""
}

// func parseBlobTags(tags *container.BlobTags) map[string]string {

// if tags == nil {
Expand Down
13 changes: 13 additions & 0 deletions component/azstorage/utils_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,7 @@ import (
"strconv"
"testing"

"github.com/Azure/azure-sdk-for-go/sdk/azcore"
"github.com/Azure/azure-sdk-for-go/sdk/azcore/to"
"github.com/Azure/azure-sdk-for-go/sdk/storage/azblob/blob"
"github.com/Azure/azure-storage-fuse/v2/common"
Expand Down Expand Up @@ -281,6 +282,18 @@ func (s *utilsTestSuite) TestSanitizeSASKey() {
assert.EqualValues("?abcd", key)
}

func (s *utilsTestSuite) TestSanitizeEtag() {
assert := assert.New(s.T())

etagValue := azcore.ETag("\"abcd\"")
etag := sanitizeEtag(&etagValue)
assert.EqualValues(etag, "abcd")

etagValue = azcore.ETag("abcd")
etag = sanitizeEtag(&etagValue)
assert.EqualValues(etag, "abcd")
}

func (s *utilsTestSuite) TestBlockNonProxyOptions() {
assert := assert.New(s.T())
opt, err := getAzBlobServiceClientOptions(&AzStorageConfig{})
Expand Down
11 changes: 9 additions & 2 deletions component/block_cache/block_cache.go
Original file line number Diff line number Diff line change
Expand Up @@ -938,12 +938,18 @@ func (bc *BlockCache) refreshBlock(handle *handlemap.Handle, index uint64, prefe

// lineupDownload : Create a work item and schedule the download
func (bc *BlockCache) lineupDownload(handle *handlemap.Handle, block *Block, prefetch bool) {
Copy link
Collaborator

@jainakanksha-msft jainakanksha-msft Feb 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kidly add junit test case, where in case of lineing up we need to set Etag.

IEtag, found := handle.GetValue("ETAG")
Etag := ""
if found {
Etag = IEtag.(string)
}
item := &workItem{
handle: handle,
block: block,
prefetch: prefetch,
failCnt: 0,
upload: false,
ETag: Etag,
}

// Remove this block from free block list and add to in-process list
Expand Down Expand Up @@ -1055,8 +1061,7 @@ func (bc *BlockCache) download(item *workItem) {

// Compare the ETAG value and fail download if blob has changed
if etag != "" {
etagVal, found := item.handle.GetValue("ETAG")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Send ETag as a struct field to the workitem, as directly retrieving etag from the handle inside the download method is not safe as it might cause a race. seems like Locking the handle to retrieve Etag inside the download method is also a bad idea, as we don't know the handle is already locked/unlocked.

if found && etagVal != etag {
if item.ETag != "" && item.ETag != etag {
log.Err("BlockCache::download : Blob has changed for %v=>%s (index %v, offset %v)", item.handle.ID, item.handle.Path, item.block.id, item.block.offset)
item.block.Failed()
item.block.Ready(BlockStatusDownloadFailed)
Expand Down Expand Up @@ -1531,6 +1536,7 @@ return_safe:
}

// Stage the given number of blocks from this handle
// handle lock must be taken before calling this function
func (bc *BlockCache) commitBlocks(handle *handlemap.Handle) error {
log.Debug("BlockCache::commitBlocks : Staging blocks for %s", handle.Path)

Expand Down Expand Up @@ -1580,6 +1586,7 @@ func (bc *BlockCache) commitBlocks(handle *handlemap.Handle) error {
return err
}

// Lock was already acquired on the handle.
if newEtag != "" {
handle.SetValue("ETAG", newEtag)
}
Expand Down
10 changes: 10 additions & 0 deletions component/block_cache/threadpool.go
Original file line number Diff line number Diff line change
Expand Up @@ -69,8 +69,18 @@ type workItem struct {
failCnt int32 // How many times this item has failed to download
upload bool // Flag marking this is a upload request or not
blockId string // BlockId of the block
ETag string // Etag of the file before scheduling.
}

// Reason for storing Etag in workitem struct:
// here getting the value of ETag inside upload/download methods
// from the handle is somewhat tricker.
// firstly we need to acquire a lock to read it from the handle.
// In these methods the handle may/maynot be locked by
// other go routine hence acquiring it again would cause a deadlock.
// It is already locked if the call came from the readInBuffer.
// It is may be locked if the call come from the prefetch.

// newThreadPool creates a new thread pool
func newThreadPool(count uint32, reader func(*workItem), writer func(*workItem)) *ThreadPool {
if count == 0 || reader == nil {
Expand Down
4 changes: 3 additions & 1 deletion component/file_cache/file_cache.go
Original file line number Diff line number Diff line change
Expand Up @@ -1042,6 +1042,9 @@ func (fc *FileCache) OpenFile(options internal.OpenFileOptions) (*handlemap.Hand
flock.Inc()

handle := handlemap.NewHandle(options.Name)
if options.Flags&os.O_TRUNC != 0 {
handle.Flags.Set(handlemap.HandleFlagDirty)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When O_TRUNC flag is passed to the open call in file cache and closed immediately without any writes on the data, the old data inside the file is not getting wiped out from the server. fixed this issue

}
inf, err := f.Stat()
if err == nil {
handle.Size = inf.Size()
Expand Down Expand Up @@ -1217,7 +1220,6 @@ func (fc *FileCache) WriteFile(options internal.WriteFileOptions) (int, error) {
if err == nil {
// Mark the handle dirty so the file is written back to storage on FlushFile.
options.Handle.Flags.Set(handlemap.HandleFlagDirty)

} else {
log.Err("FileCache::WriteFile : failed to write %s [%s]", options.Handle.Path, err.Error())
}
Expand Down
13 changes: 12 additions & 1 deletion component/libfuse/libfuse2_handler.go
Original file line number Diff line number Diff line change
Expand Up @@ -221,6 +221,12 @@ func populateFuseArgs(opts *C.fuse_options_t, args *C.fuse_args_t) (*C.fuse_opti
options += fmt.Sprintf(",umask=%04d", opts.umask)
}

// force the fuse library to always pass O_TRUNC flag on open call
// Not checking the options since we don't allow user to configure this flag.
// This is the default behaviour for the fuse3 hence we don't pass this flag there.
// ref: https://github.com/libfuse/libfuse/blob/7f86f3be7148b15b71b63357813c66dd32177cf6/lib/fuse_lowlevel.c#L2161C2-L2161C16
options += ",atomic_o_trunc"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add atomic_o_trunc flag in fuse2 options that will allow the fuse library to always pass the O_TRUNC flag on open call. see the comments in the code for more details


// direct_io option is used to bypass the kernel cache. It disables the use of
// page cache (file content cache) in the kernel for the filesystem.
if fuseFS.directIO {
Expand Down Expand Up @@ -920,7 +926,12 @@ func libfuse2_rename(src *C.char, dst *C.char) C.int {
libfuseStatsCollector.UpdateStats(stats_manager.Increment, renameDir, (int64)(1))

} else {
err := fuseFS.NextComponent().RenameFile(internal.RenameFileOptions{Src: srcPath, Dst: dstPath})
err := fuseFS.NextComponent().RenameFile(internal.RenameFileOptions{
Src: srcPath,
Dst: dstPath,
SrcAttr: srcAttr,
DstAttr: dstAttr,
})
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update renameFileOptions in fuse2 handler

if err != nil {
log.Err("Libfuse::libfuse2_rename : error renaming file %s -> %s [%s]", srcPath, dstPath, err.Error())
return -C.EIO
Expand Down
2 changes: 1 addition & 1 deletion go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ require (
github.com/Azure/azure-sdk-for-go/sdk/azcore v1.17.0
github.com/Azure/azure-sdk-for-go/sdk/azidentity v1.8.1
github.com/Azure/azure-sdk-for-go/sdk/storage/azblob v1.6.0
github.com/Azure/azure-sdk-for-go/sdk/storage/azdatalake v1.4.0
github.com/Azure/azure-sdk-for-go/sdk/storage/azdatalake v1.4.0-beta.1
github.com/JeffreyRichter/enum v0.0.0-20180725232043-2567042f9cda
github.com/fsnotify/fsnotify v1.8.0
github.com/golang/mock v1.6.0
Expand Down
Loading
Loading